Editor’s Statement


A large body of mathematics consists of facts that can be presented and 
described much like any other natural phenomenon. These facts, at times 
explicitly brought out as theorems, at other times concealed within a proof, 
make up most of the applications of mathematics, and are the most likely 
to survive changes of style and of interest. 

This ENCYCLOPEDIA will attempt to present the factual body of all 
mathematics. Clarity of exposition, accessibility to the non-specialist, and a 
thorough bibliography are required of each author. Volumes will appear in 
no particular order, but will be organized into sections, each one compris- 
ing a recognizable branch of present-day mathematics. Numbers of 
volumes and sections will be reconsidered as times and needs change. 

It is hoped that this enterprise will make mathematics more widely used 
where it is needed, and more accessible in fields in which it can be applied 
but where it has not yet penetrated because of insufficient information. 


GIAN-CARLO ROTA 

Foreword


Our society faces difficult problems, for instance, fairly allocating scarce 
resources, making good societal decisions about energy use and pollution, 
reasonably disseminating urban and social services, and understanding 
and eliminating social inequalities. These problems involve in an important 
way issues in the social sciences. Increasingly, mathematics is being used, 
at least in small ways, to tackle these problems, and also to develop the 
necessary underlying principles of such fields as sociology, psychology, 
and political science. 

This volume is the first in the section, Mathematics and the Social 
Sciences. The section is expected to present mathematical treatments of 
social scientific problems as well as theories and mathematical tools, 
techniques, and questions motivated by problems of the social sciences. It 
is anticipated that this classification will concern itself with such funda- 
mental topics as learning theory, perception, signal detection, scaling and 
measurement, social networks, social mobility theory, voting behavior, 
social choice, and utility theory. At the same time, the section will concern 
itself with applications to problems of society, and deal with such problems 
as environment, transportation, urban affairs, and energy, from a societal 
point of view. 

The problems of the social sciences in general, and the problems facing 
society in particular, are extremely complex. We should not expect too 
much of mathematics when it comes to solving these problems. On the 
other hand, mathematics, as the language of science, has a very important 
role to play. As the most precise language ever invented by man, mathe- 
matics is a tool which can be used to carefully formulate social scientific 
problems, give us insight into the nature of these problems, and suggest 
potential approaches. At the same time, problems of the social sciences can 
be and have been the stimulus for the development of new mathematics, 
which is often very interesting in its own right and is, as yet, not very well 
known even in the mathematical community. Because the social sciences 
are a stimulus to new mathematics, this series is concerned with mathemat- 
ics and the social sciences, not just with mathematics in the social sciences. 

If mathematics is the language of science, perhaps one of the crucial 
reasons is that it helps us to measure things. Since the ability to measure is 
critical to the development of science and to the solution of many prob- 
lems, it is appropriate that the first volume in this section be concerned 
with the theory of measurement. This volume is concerned with founda- 
tions of measurement, in particular with putting measurement in the social 
sciences on a firm mathematical foundation. It takes a very broad point of 
view of the nature of measurement in particular and of mathematical 
treatment of the social sciences in general. The volume takes the attitude 
that treating a problem mathematically—even performing measure- 
ment—does not require the assignment of numbers. Rather, it involves the 
use of precisely defined mathematical objects and relations among them to 
reflect empirical objects and observed relations among these objects. This 
attitude will be reflected in other volumes in this classification as well. 

This section is begun while the jury is still out on the role of mathemat- 
ics vis-a-vis the social sciences. There can be no doubt that mathematics is 
already finding widespread use in the social sciences. It is hoped that this 
will lead to some really important breakthroughs in the understanding of 
social and societal problems. At the same time, these problems have 
already been a stimulus to the development of new mathematics, and 
should continue to be in the future. 


FRED S. ROBERTS 
General Editor, Section on 
Mathematics and the Social Sciences 


Preface 


1 Measurement Theory 


There is a large body of research work in a gray area which seems to 
have no disciplinary home; it can be called measurement theory. This 
work has been performed by philosophers of science, physicists, psycholo- 
gists, economists, mathematicians, and others. In the past several decades, 
much of this work has been stimulated by the need to put measurement in 
the social sciences on a firm foundation. As well as being closely tied to 
applications, measurement theory has a very interesting and serious 
mathematical component, which, surprisingly, has escaped the attention of 
most of the mathematical community. 

This book presents an introduction to measurement theory from a 
representational point of view. The emphasis is on putting measurement in 
the social and behavioral sciences on a firm foundation, and the applica- 
tions will be chosen from a variety of problems in decision theory, 
economics, psychophysics, policy science, etc. The purpose of this book is 
to present an introduction to the theory of measurement in a form 
appropriate for the nonspecialist. I hope that both mathematics students 
and practicing mathematicians with no prior exposure to the subject will 
find this material interesting, both as mathematics in its own right and 
because of its applications. I hope, indeed, that a number of mathemati- 
cians will find this subject interesting enough to solve some of the open 
problems posed in the text. I also hope that nonmathematicians with 
sufficient mathematical background will find the work thought-provoking 
and useful. I believe that the results, especially those on scale type, 
meaningfulness, organization of data, etc., should be of interest to psychol- 
ogists, economists, statisticians, philosophers, policymakers, and others. 
The results on decisionmaking and utility are potentially of interest to 
executives, policy advisors, managers, etc. The theory of measurement has, 
I believe, some hope of assisting some social scientists in the construction 
of theories and other social scientists in organizing data, reporting observa- 
tions, and reasoning about the phenomena they study. The theory also has 
the potential to assist decisionmakers in making better decisions about 
such problems as energy, transportation, pollution, and public health. 
When systematically applied to such problems, techniques of measurement 
as discussed in this book can lead to an intuitive or qualitative understanding,
which is often more important than the formal results obtained.
(Keeney and Raiffa [1976] make exactly this point in their very interesting 
book.) 


2 Other Books on the Subject 


This material has benefited from earlier books on measurement theory 
with much the same philosophy, in particular the books by Pfanzagl [1968]
and by Krantz et al. [1971]. The work does not intend to approach the 
scope and depth of the latter book, which is highly recommended to those 
who wish to get further into the field of measurement. Rather, this work is 
written at an introductory level, with an attempt to mention a variety of 
topics and discuss their applications. It is aimed at introducing the nonspe- 
cialist to the problems with which measurement theory is concerned, 
the mathematical concepts with which measurement theory deals, the 
questions that measurement theorists ask, and the applications that 
measurement theory has. The book is also concerned with applications of 
measurement-theoretic techniques to decisionmaking, and to problems of 
public policy and society. It is in the increased emphasis on these sorts of 
applications that the book differs from those mentioned earlier, and falls 
closer to the books by Raiffa [1968] and by Keeney and Raiffa [1976]. 
Finally, in its emphasis on utility, the book has been influenced by such 
works as Fishburn [1970]. 


3 Use as a Textbook 


The material in this book has been used for several undergraduate and 
graduate courses in mathematics at Rutgers. The undergraduate course,
on mathematical models in the social sciences, was taught at the junior
level and used measurement theory as one topic. The
material covered was that in Chapters 1 and 2 and Sections 3.1 and 6.1, 
with most proofs omitted or simplified. The graduate course has been 
given several times, at a level appropriate for first-year graduate students 
in mathematics, and covers most of the book. The order of presentation of 
topics varies from time to time, as the material in later chapters is 
relatively independent and can be presented in different orders. For 
example, I often present Chapter 6 immediately after Chapter 3. (See 
Section 6 of the Introduction for a discussion of interdependencies of the 
chapters.) A course for nonmathematics students could be given with this 
material if many of the proofs were omitted or if some proofs were 
included and students with especially strong mathematical background 
were enrolled. 

Specific prerequisites for any of these courses are few: elementary
concepts of sets (union, intersection, etc.), logical notation and principles 
of inference (implication, quantification, contrapositive, etc.), and proper- 
ties of the real number system (order, Archimedean, etc.). However, the 
presentation will be on much too high a level for most students without the 
sophistication of undergraduate mathematics beyond the calculus. Parts of 
the book use more difficult mathematics: modern algebra (elementary 
group theory), notions of density, continuity, and metrics from analysis, 
and point set topology. However, it is easy enough to skip these parts, and 
they are usually designated as candidates for skipping. As for nonmathe- 
matical prerequisites, it is hard to specify them. Certainly the reader is not 
expected to be an expert in any of the social sciences. Different sections 
will use terminology and concepts from physics, psychology, economics, 
environmental science, etc. It will probably be necessary for the reader to 
look up some background material from time to time. Indeed, this is to be 
encouraged. 

The exercises form an integral part of this book. They review ideas 
presented in the text, present concrete examples and generalizations, and 
introduce new material. Some of the exercises are of a routine mathemati- 
cal nature, while others emphasize applications or ask the reader to 
speculate about the applicability of some abstract idea or the reasonable- 
ness of some assumption. For these latter kinds of exercises, of course, 
there is not necessarily a “right” answer. Indeed, the same can be said 
about many of the potential applications of the theory of measurement 
discussed in this book. On many occasions, rather than provide a “right” 
answer, measurement theory will, I hope, provide the user with intuition, 
insight, and understanding about some phenomenon he is trying to study 
or some complex decision he is trying to make. 

Introduction


1 Measurement 


A major difference between a “well-developed” science such as physics 
and some of the less “well-developed” sciences such as psychology or 
sociology is the degree to which things are measured. In this volume, we 
develop a theory of measurement that can act as a foundation for measure- 
ment in the social and behavioral sciences. Starting with such classical 
measurement concepts of the physical sciences as temperature and mass, 
we extend a theory of measurement to the social sciences. We discuss the 
measurement of preference, loudness, brightness, intelligence, and so on. 
We also apply measurement to such societal problems as air and noise
pollution, weather forecasting, and public health, and comment on the 
development of pollution indices and consumer price indices. 

Throughout, we apply the results to decisionmaking. The decisionmak- 
ing applications deal with transportation, consumer behavior, environmen- 
tal problems, energy use, and medicine, as well as with laboratory situa- 
tions involving human and animal subjects. 

In this introduction, we try to give the reader a quick preview of the 
contents and organization of the book, and of the problems we shall 
address. The reader might prefer to read the introduction rather quickly 
the first time, and to return to it later. 

The following questions are some of those we shall ask. A few of these 
are stated here in very general terms, and of course we shall try to be more 
specific in what follows. 


1. What does it mean to measure preference, likes and dislikes, etc.?
2. When does it make sense to say goal a is twice as important as goal 
b? Or twice as worthwhile? 
3. Does it make sense to assert that the average IQ of one group of
individuals is twice the average IQ of a second group? 

4. Is it possible to measure air pollution with one index that takes 
account of many different pollutants? If so, does it make sense to assert 
that the pollution level today is 20% lower than it was yesterday? 

5. Is it meaningful to assert that the consumer price index has in- 
creased by 20% in a given period? 

6. How can we quantitatively relate subjective judgments of loudness
of a sound to the physical intensity of the sound? And how can we use 
such quantitative relationships to develop indices of noise pollution? 

7. How can we use expert judges to judge the relative importance, or 
significance, or merit of alternative candidates, and then combine the 
judgments of the experts into one measure of importance, significance, or 
merit? 

8. How can we choose between two different treatments for a disease, 
given that we are not sure exactly what the outcomes of the treatments will 
be? 

9. Can subjective judgments that one event is more likely to occur than 
another be quantified? 

10. Is measurement still possible if judgments of preference, relative 
importance, loudness, etc., are inconsistent? 


At an early stage of scientific development, measurement is usually 
performed at only the crudest level, that of classification. Some philoso- 
phers of science (e.g., Torgerson [1958]) do not even wish to call this 
measurement. What are the advantages of performing measurement that 
goes beyond simple classification? Hempel [1952] describes some of these. 
First, if we can measure things, we can begin to differentiate more than we 
can by simply classifying. For example, we can do more than simply
distinguish between warm objects and cold ones; we can assign degrees
of warmth. Greater descriptive flexibility leads to greater flexibility in the 
formulation of general laws. (One should imagine trying to state the laws 
of physics using only classifications!) Measurement is usually performed 
by assigning numbers—though we shall argue below that this is not a 
prerequisite for measurement. Assignment of numbers makes possible the 
application of the concepts and theories of mathematics. General laws can 
now be stated in mathematical language—for example, as formal relations 
among quantities. Mathematical tools for analysis of numbers can help us 
to reason about objects and their properties, and to deduce general 
principles describing these properties. Thus, the existence of and experi- 
ence with centuries of mathematical reasoning is a large part of the reason 
we find it useful to measure things. 

Table 1. Measurement

A. Types of Measurement
   1. Fundamental
   2. Derived
B. Problems in Fundamental Measurement
   1. Representation
   2. Uniqueness
C. Axioms in a Representation Theorem
   1. Prescriptive or normative
   2. Descriptive


We shall study two quite different types of measurement, fundamental
measurement and derived measurement. (See Table 1.) Fundamental
measurement, as we shall describe it, takes place at an early stage of
scientific development, when several fundamental concepts are measured 
for the first time. Mass, temperature, and volume are fundamental 
measures. Derived measurement takes place later, when some concepts 
have already been measured, and new measures are defined in terms of 
existing ones. Density can be thought of as a derived measure, defined as 
mass divided by volume. (Derived measurement is usually a relative 
matter. As we shall point out, the same scale—for example, density—can 
be introduced either as a derived scale or as a fundamental one.) Much of 
this work will be concerned with fundamental measurement. However, 
large parts of Chapters 2, 4, and 6 deal with derived measurement. 

In the development of a scientific discipline, fundamental measurement 
is not usually performed in as formalistic a way as we describe in this 
volume. We usually do not attempt to describe an actual process that is 
undergone as techniques of measurement are developed. We are not 
interested in a measuring apparatus and in the interaction between the 
apparatus and the objects being measured. Rather, we attempt to describe 
how to put measurement on a firm, well-defined foundation. A number of 
the results, however, are of potential practical significance, and we shall 
attempt to mention practical techniques for actually computing measures 
or scales whenever possible. Once measurement has been put on a firm 
foundation, we develop a variety of tools for analyzing statements made in 
terms of scale values. These techniques have immediate application in a 
variety of practical problems. 

Putting measurement on a firm foundation is not a terribly important 
activity in the modern-day physical sciences; many physicists would prob- 
ably not consider it physics, but only “philosophy of physics.” Practicing 
physicists usually take measurement for granted, and anyway measure- 
ments are usually based on powerful, well-established theories. In the 
social sciences, on the other hand, much present activity can be cate- 
gorized as the search for appropriate scales of measurement to describe 
behavior, aid in decisions, etc. Putting measurement on a firm foundation
can potentially play a very important role in the development of the social 
sciences. 

Since much of this volume is devoted to putting measurement on a firm 
foundation, it is not surprising that much of the discussion is axiomatic in 
nature. We try to develop axioms or conditions under which measurement 
is possible. These are usually conditions about an individual’s judgments, 
preferences, reactions, and so on. For example, we shall show that prefer- 
ences can be measured in a consistent way provided that they satisfy two 
axioms: 


(A) If you prefer a to b, you do not prefer b to a.
(B) If you do not prefer a to b, and do not prefer b to c, then you do 
not prefer a to c. 
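
For a finite set of alternatives, axioms (A) and (B) can be checked mechanically. The following is a minimal sketch and is not taken from the text: the function names and the preference data are hypothetical, and it assumes that judgments are recorded as a set of ordered pairs (x, y) meaning "x is preferred to y". When both axioms hold on a finite set, counting how many alternatives each one is preferred to gives one numerical assignment that preserves the preferences.

```python
from itertools import product


def satisfies_axiom_a(prefers, items):
    """Axiom (A): if x is preferred to y, then y is not preferred to x."""
    return all(
        not ((x, y) in prefers and (y, x) in prefers)
        for x, y in product(items, repeat=2)
    )


def satisfies_axiom_b(prefers, items):
    """Axiom (B): if x is not preferred to y and y is not preferred to z,
    then x is not preferred to z."""
    return all(
        (x, y) in prefers or (y, z) in prefers or (x, z) not in prefers
        for x, y, z in product(items, repeat=3)
    )


def count_utility(prefers, items):
    """Assign to each item the number of items it is preferred to."""
    return {x: sum((x, y) in prefers for y in items) for x in items}


# Hypothetical judgments, invented for illustration.
items = ["tea", "coffee", "water"]
consistent = {("coffee", "tea"), ("coffee", "water"), ("tea", "water")}
cyclic = {("coffee", "tea"), ("tea", "water"), ("water", "coffee")}

consistent_ok = (satisfies_axiom_a(consistent, items)
                 and satisfies_axiom_b(consistent, items))
cyclic_ok = (satisfies_axiom_a(cyclic, items)
             and satisfies_axiom_b(cyclic, items))

utility = count_utility(consistent, items)

# The assignment represents the preferences if x is preferred to y
# exactly when x receives the larger number.
represents = all(
    ((x, y) in consistent) == (utility[x] > utility[y])
    for x, y in product(items, repeat=2)
)
```

The cyclic judgments satisfy (A) but violate (B), so no numerical assignment of this kind can represent them.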


Such axioms can be looked at in two ways. The prescriptive or normative 
interpretation looks at the axioms as conditions of rationality. A truly 
rational man, given ideal conditions (unlimited computational ability, 
unlimited resources, etc.), should make judgments that satisfy the axioms, 
or else he is not acting rationally. Theories can be built based on the 
definition of a rational man, and procedures for a rational man to make 
decisions based on his judgments can be developed.* A second interpreta- 
tion is that these axioms are descriptive. They give conditions on behavior 
which, if satisfied, allow measurement to take place. Whether or not 
measurement can take place then depends on whether or not an indi- 
vidual’s judgments satisfy the axioms, and we hope that this is a testable 
question. Whenever possible in this volume, we try to discuss tests of the 
axioms presented. Although it is not fair to generalize, it is often true that 
the prescriptive axiomatic theories of measurement in the social sciences 
have been developed in the economic literature, and the descriptive theo- 
ries in the psychological literature. The theories of physical measurement 
in general are simultaneously prescriptive and descriptive, and have largely 
philosophical significance. However, both the prescriptive and descriptive 
theories of measurement in the social sciences propose new theories and 
lead to new laws (the axioms), suggest experiments to distinguish between 
these laws, and give rise to practical techniques of measurement where 
none was possible before. (See Krantz [1968, Section 2.1] for a more 
detailed discussion of this point.) 

*Keeney and Raiffa [1976] distinguish between the normative and the prescriptive. The
normative interpretation refers to an “idealized, ...superrational being with an all-powering
intellect,” while the prescriptive refers to “normally intelligent people who want to think hard
and systematically...” We shall not make this distinction.


The problem of finding axioms under which measurement is possible
will be called in this volume the representation problem. Once measurement
has been accomplished, we also consider the uniqueness problem: How
unique is the resulting measure or scale? This problem will be very
important in telling us what kinds of comparisons and what kinds of 
mathematical manipulations are possible with the measures obtained. For 
example, we shall ask whether it is meaningful to say that one group’s 
average IQ is twice that of a second group’s, or that the average air 
pollution level in one city is greater than that in a second city. We shall 
discuss the uniqueness problem in depth in Chapter 2, and return to it 
often. 
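
The flavor of the uniqueness problem can be previewed with temperature and mass, the two physical examples mentioned earlier. The sketch below is illustrative only: the function names and sample values are hypothetical. It assumes that Celsius and Fahrenheit readings are related by a positive affine change of scale, while kilograms and pounds are related by multiplication by a positive constant. Under those assumptions a "twice as much" comparison survives the second kind of change but not the first, whereas an ordering comparison survives both.

```python
def celsius_to_fahrenheit(c):
    """Positive affine change of scale: multiply, then shift."""
    return 9.0 * c / 5.0 + 32.0


def kilograms_to_pounds(kg):
    """Change of unit only: multiply by an (approximate) positive constant."""
    return 2.20462 * kg


def is_twice(x, y, tol=1e-9):
    """True if x equals 2*y up to a small relative tolerance."""
    return abs(x - 2.0 * y) <= tol * max(1.0, abs(x), abs(y))


# Hypothetical readings, invented for illustration.
warm_c, cool_c = 20.0, 10.0
heavy_kg, light_kg = 20.0, 10.0

twice_in_celsius = is_twice(warm_c, cool_c)
twice_in_fahrenheit = is_twice(celsius_to_fahrenheit(warm_c),
                               celsius_to_fahrenheit(cool_c))

twice_in_kilograms = is_twice(heavy_kg, light_kg)
twice_in_pounds = is_twice(kilograms_to_pounds(heavy_kg),
                           kilograms_to_pounds(light_kg))

# "Warmer than" is preserved by the change of temperature scale.
warmer_in_celsius = warm_c > cool_c
warmer_in_fahrenheit = (celsius_to_fahrenheit(warm_c)
                        > celsius_to_fahrenheit(cool_c))
```

The "twice" statement about temperature changes truth value with the choice of scale, which is the kind of difficulty the IQ question raises.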


2 The Measurement Literature 


Our approach to measurement theory follows very closely that of Scott 
and Suppes [1958], Suppes and Zinnes [1963], Pfanzagl [1968], Krantz 
[1968], and Krantz et al. [1971]. There is a long-standing literature on the 
nature of measurement, and the reader might wish to consult some of this 
literature for other points of view. Much of this literature concerns itself 
with measurement in the physical sciences. Early influential books were 
written by Campbell [1920, 1928, 1938]. Campbell and such writers as 
Helmholtz [1887], Cohen and Nagel [1934], Guild [1938], Reese [1943], and 
Ellis [1966] did not accept measurement unless it involved some sort of 
concatenation or addition operation of the type we shall describe in 
Section 3.2. Although this approach to measurement was quite appropriate 
for physics, it was not really broad enough to encompass the measurement 
problems of the social sciences. However, as late as 1940, a committee of 
the British Association for the Advancement of Science questioned specifi- 
cally whether psychologists such as S. S. Stevens, who were measuring 
human sensations such as loudness, were really performing measurement, 
since they used no concept of addition (Final Report of the British 
Association for the Advancement of Science [1940]). Our approach is 
broader than that of the classical writers, and considers a measurement 
theory that can act as a basic foundation for measurement in the social 
sciences. Some other important references on measurement theory, closer 
to our point of view, are Stevens [1946, 1951, 1959, 1968] and Adams 
[1965]. Stevens was the first to observe that uniqueness of a measurement 
assignment defined scale type, in the sense we describe in Section 2.3. 
Although chemistry and biology have used scales not much different from 
those of physics, the social sciences, in measurement of preference, intelli- 
gence, etc., have given rise to entirely new types of scales, with somewhat 
different character. We shall explore these in detail. 


3 Decisionmaking


Table 2. Classification of Decisionmaking Problems

A. Who Makes the Decision?
   1. Individual
   2. Group
B. How Much Information about Consequences of Actions Is Known?
   1. Certainty
   2. Risk
   3. Uncertainty


A major application of measurement theory is to problems of decisionmaking.
Various authors (for example, Luce and Raiffa [1957]) have
classified decisionmaking problems in several ways. (See Table 2.) The first
distinction to make is whether the decision is being made by an individual
or a group. Some groups act as individuals; the difference is whether 
members of the group are expressing their own opinions from among 
which the group must choose. In this volume, we shall be almost exclu- 
sively concerned with the individual decisionmaker. For good summaries 
of the group decisionmaking problem, see Luce and Raiffa [1957], Sen 
[1970], or Fishburn [1972]. A classic reference is Arrow [1951]. More 
specifically for the decisionmaking problem involving elections, see Black 
[1958], Farquharson [1969], and Riker and Ordeshook [1973]. 

A second distinction between problems of decisionmaking involves how 
much certainty there is about outcomes of various actions. If we are trying 
to choose among alternative acts, we say that we are in a situation of 
certainty if for each act there is exactly one consequence. We are in a 
situation of risk if for each act there is a set of possible consequences, none 
of which occurs with certainty, but each of which occurs with a known 
probability. Finally, we are in a situation of uncertainty if the probabilities 
that consequences will occur are unknown. In practice many decisionmak- 
ing problems involve a mixture of risk and uncertainty (or even of 
certainty, risk, and uncertainty). 
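
The three situations can be told apart from a description of a single act. The following is a minimal sketch under stated assumptions, not a procedure from the text: the function name `classify_act` and the sample acts are hypothetical, and an act is assumed to be described by a mapping from each possible consequence to its probability, with `None` standing for an unknown probability.

```python
from typing import Dict, Optional


def classify_act(consequences: Dict[str, Optional[float]]) -> str:
    """Classify one act as a situation of certainty, risk, or uncertainty.

    `consequences` maps each possible consequence of the act to its
    probability, or to None when that probability is unknown.
    """
    if not consequences:
        raise ValueError("an act must have at least one consequence")
    if len(consequences) == 1:
        # Exactly one consequence: it occurs with certainty.
        return "certainty"
    if all(p is not None for p in consequences.values()):
        # Several possible consequences, each with a known probability.
        return "risk"
    # Several possible consequences, some probabilities unknown.
    return "uncertainty"


# Hypothetical acts, invented for illustration.
take_train = {"arrive on time": 1.0}
toss_coin = {"heads": 0.5, "tails": 0.5}
new_treatment = {"recovery": None, "no change": None}

labels = [classify_act(act) for act in (take_train, toss_coin, new_treatment)]
```

A problem that mixes the three situations would simply yield different labels for different acts.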

In this volume, we shall concentrate primarily on decisions involving 
certainty. However, in Chapter 7 we shall discuss decisionmaking under 
risk and under uncertainty. In Chapter 8, we shall discuss how to measure 
the probability of various outcomes, given subjective judgments about 
which of two outcomes is more probable. 


4 Utility 


If decisions are being made in a situation of certainty, then we often 
choose that act whose certain consequence maximizes (or minimizes) some 
index—for example, a measure of value, worth, satisfaction, or utility. The 
notion of utility goes back at least to the eighteenth century. Much of the 
original interest in this concept goes back to Jeremy Bentham, who defines 
utility in the first chapter of The Principles of Morals and Legislation 
(1789), as follows: 

By utility is meant that property in any object, whereby it tends to produce benefit,
advantage, pleasure, good, or happiness (all this in the present case comes to the same 
thing), or (what comes again to the same thing) to prevent the happening of mischief, 
pain, evil, or unhappiness to the party whose interest is considered: if that party be the 
community in general, then the happiness of the community; if a particular individual, 
then the happiness of that individual. 


Bentham formulated procedures for measuring utility, for he thought 
societies should strive for “the greatest good for the greatest number”— 
that is, maximum utility. In this volume, we shall usually look at the 
measurement of utility as a representation problem and list various axiom 
systems sufficient for the existence of utility functions, that is, measures of utility.
In the early applications of utility theory in economics, there was an 
emphasis by economists such as Walras and Jevons on obtaining utility 
functions that were additive in the sense that the utility of the combination 
of two objects was the sum of the individual utilities of the two objects. 
Such a utility function is called cardinal. (More generally, a utility function 
will be called cardinal if it is unique up to a positive linear transformation.) 
We shall discuss axioms for cardinal utility in Chapters 3 and 5. In the late 
nineteenth century, Edgeworth [1881] questioned the assumption of addi- 
tivity. In the early twentieth century, the economist Pareto [1906] showed 
that much of economic theory depends only on the assumption that the 
utility function is ordinal—that is, that the utilities can be used only to 
decide which of two objects has a higher value, and addition in particular 
does not necessarily make sense. We shall discuss ordinal utility in Section 
3.1. 
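
The contrast between additive and ordinal utility can be made concrete. The sketch below uses invented utility values and a hypothetical strictly increasing transformation (cubing); neither comes from the text. It assumes, as the paragraph above describes, that an ordinal utility function is used only to decide which of two objects has the higher value. Under that assumption the transformed function ranks single objects exactly as the original does, but a comparison that relies on adding utilities can come out differently.

```python
# Hypothetical utility values, invented for illustration.
u = {"apple": 1.0, "pear": 2.0, "plum": 2.5}

# A strictly increasing transformation of u (cubing each value).
v = {item: value ** 3 for item, value in u.items()}


def ranking(utility):
    """Items listed from lowest to highest utility."""
    return sorted(utility, key=utility.get)


# Ordinal information: both functions rank the objects the same way.
same_ranking = ranking(u) == ranking(v)

# Additive information: is an apple together with a pear worth more than
# a plum, if the value of a combination is the sum of the parts?
sum_beats_plum_u = u["apple"] + u["pear"] > u["plum"]
sum_beats_plum_v = v["apple"] + v["pear"] > v["plum"]
```

Here `u` and `v` agree on every pairwise comparison of single objects yet disagree about the combination, so the additive comparison is not determined by ordinal information alone.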

Table 3. Classification of Theories of Utility

A. How Much Information about Consequences of Actions Is Known
   to the Decisionmakers?
   1. Certainty
   2. Risk
   3. Uncertainty
B. What Kind of Representation Theorems?
   1. Algebraic and deterministic
   2. Probabilistic
C. What Judgments Is Measurement of Utility Based on?
   1. Simple choices
   2. Rankings or choices from a set


Luce and Suppes [1965] classify theories of utility in several ways. (See
Table 3.) A similar classification applies to theories of measurement in
general. First, these theories can involve decisionmaking under certainty,
risk, or uncertainty. We have already discussed these distinctions. A
second distinction is whether the theories are algebraic and deterministic or
probabilistic. Most of the traditional utility theories are algebraic, and the
representation theorems state axioms that are very algebraic in nature. For
example, the axioms for cardinal utility functions stated in Section 3.2 say 
that preference and combination of objects define a certain kind of 
ordered semigroup. (Sometimes, if there is an interest in continuous utility 
functions, topological axioms must be added.) Our basic formulation of 
fundamental measurement is algebraic, starting with the idea of a relation 
and an operation (Chapter 1), and then basing measurement on a rela- 
tional system consisting of certain empirical relations and operations 
(Section 1.8). Sometimes it is useful to modify algebraic theories. This is 
the case when the fundamental judgments that representation theorems 
describe are made inconsistently or made consistently, but according to 
some statistical regularity. When data is inconsistent or there is no discern- 
ible deterministic pattern available, the algebraic theories must be replaced 
by probabilistic theories, which are built around this more random data. 
Falmagne [1976] argues that we shall (almost) always have to replace 
algebraic theories by probabilistic or random ones, at least in applications 
to the behavioral sciences. He is concerned with developing probabilistic 
analogues of the traditional algebraic theories of fundamental measure- 
ment. We discuss some probabilistic theories in Section 6.2, but otherwise 
our basic approach is algebraic. 

Both the algebraic and probabilistic theories of utility can take two 
forms. The first form assumes that a utility function can be derived simply 
on the basis of simple choices among pairs of alternatives or objects. The 
second asks for a ranking among elements in each set of alternatives, or 
choice of a best element from the set, before deriving a utility function. We 
shall discuss only simple choice theories in this volume. Luce and Suppes 
[1965, Section 6] and Krantz et al. [to appear] summarize some probabilistic 
ranking theories. 

Some excellent surveys of utility theory and decisionmaking are Fish- 
burn [1968, 1970] and Luce and Suppes [1965] and, for utility with 
multi-attributed alternatives, Farquhar [1977] and Keeney and Raiffa 
[1976]. Aoki, Chipman, and Fishburn [1971] list many references on 
preferences, utility, and demand. Other references on utility theory include 
Adams [1960], Arrow [1951], Chipman [1960], Debreu [1959], Edwards 
[1954, 1961], Fischer and Edwards [1973], Fishburn [1964], Luce and 
Raiffa [1957], Majumdar [1958], Savage [1954], Slovic and Lichtenstein 
[1971], Stigler [1950], and von Neumann and Morgenstern [1944]. For a 
more complete list of references, see Luce and Suppes [1965] or Fishburn 
[1970]. 


5 Mathematics 


The study of measurement, decisionmaking, and utility has given rise to 
interesting new mathematical results and questions, most of which are not 
very well known among mathematicians. A major purpose of this book is 
to organize a class of these mathematical results and to introduce the 
mathematically trained reader to them. 

Most of the mathematics in this book is distinctly algebraic in flavor. It 
deals with ordered algebraic systems and homomorphic mappings of one 
such system into another. Many of these systems are finite, so there is a 
discrete flavor to much of the mathematics. Fundamental properties of the 
real number system are used throughout, and many of the results are 
related to branches of logic such as set theory and foundations of geome- 
try. A variety of results and tools of an analytic or topological nature are 
scattered throughout. For example, solutions of certain functional equa- 
tions are studied in Chapter 4. 

Finally, much of the mathematics described here has a flavor of its own. 
It is to be expected that, as more mathematicians become interested in 
problems of the social sciences, new forms of mathematics will have to be 
developed to solve these problems. Hopefully a work like this one will 
stimulate some mathematically trained individuals to become involved in 
such developments. 


6 Organization of the Book 


Since relational systems form a basis for much of the theory of measure- 
ment, I have chosen to include an introductory chapter on the theory of 
relations, which includes most of the relation-theoretic concepts needed 
later on. The reader familiar with this theory can skip much of Chapter 1, 
though he should read Theorems 1.2, 1.3, and 1.4 and Section 1.8. 

Chapter 2 introduces fundamental and derived measurement and defines 
the two basic problems of fundamental measurement—the representation 
problem and the uniqueness problem. It introduces scale type and uses 
scale type to study what statements involving scales are meaningful. It 
applies the results to a variety of practical problems, such as the making of 
index numbers (consumer price, consumer confidence) and the measure- 
ment of air pollution. Chapter 3 illustrates fundamental measurement by 
giving three basic representation theorems, stating conditions under which 
measurement can be performed; one is for ordinal measurement, one for 
extensive measurement, and one for difference measurement. In the utility 
interpretation, the first theorem gives conditions for the existence of 
ordinal utility functions, and the second and third conditions for the 
existence of cardinal utility functions. Chapter 4 is an applications chapter. 
It presents the theory of psychophysical scaling, and applies the results 
about scale type and fundamental and derived measurement to this theory. 
It concentrates on the measurement of loudness, and shows how one might 
derive a measure of loudness from known physical scales of intensity of a 
sound. 

Chapters 5 through 8 introduce various complications into the measure- 
ment picture. In Chapter 5, we study “complicated” or multidimensional 
alternatives. A new kind of fundamental measurement, called conjoint 
measurement, is introduced. The results have application to combinations 
of psychological factors such as drive and incentive, to binaural additivity 
of loudness, to mental testing problems, and to measuring discomfort due 
to different weather factors. They also have applications to problems of 
urban services, allocation of funds for education, treatment of medical 
problems, design of large public facilities such as airports, etc. In Chapter 
6, we ask what to do when it is not even possible to perform ordinal 
measurement in the sense of Section 3.1—that is, when the necessary 
conditions for ordinal measurement are violated. We widen our scope, and 
introduce the idea of measurement without numbers or measurement when 
the basic data is inconsistent. We study Luce’s fundamental idea of a 
semiorder and then give examples of probabilistic theories of measure- 
ment. A variety of applications to data from pair comparison experiments 
are presented. 

Chapter 7 discusses the problem of decisionmaking under risk or uncer- 
tainty. It introduces the famous expected utility hypothesis, which goes 
back to Bernoulli [1738], and which says that a decisionmaker chooses that 
action which maximizes his expected utility. Accepting this as a prescription,
we give applications to decisionmaking problems from such fields as 
transportation, medicine, and public health, and to calculation of utility. A 
number of descriptive utility theorists believe that although we do not 
consciously calculate expected utilities, we act as if we are maximizing 
expected utility. We present axiom systems which give conditions on 
choices among acts with risky or uncertain consequences sufficient to 
guarantee that these choices are made as if expected utility were being 
maximized. 

In a decisionmaking situation under uncertainty, and in other situations 
as well, it is sometimes useful or necessary to be able to calculate 
probabilities that reflect our judgments that certain events are subjectively 
more probable than others. In Chapter 8, we discuss the measurement of 
subjective probability. 

Chapters 1 and 2 and Section 3.1 form a groundwork for the rest of the 
book. Much of the rest of the book can be read in any order. For example, 
Sections 3.2 and 3.3 and Chapters 4, 5, 6, 7, and 8 are essentially 
independent, though Section 7.3 depends on Chapter 5 and parts of 
Chapter 8 depend on earlier material in the beginning of Chapter 7 and in 
Section 3.2. Even Sections 6.1 and 6.2 are essentially independent, with the 
exception that Section 6.2.4 depends on Section 6.1. 

There are so many topics to be covered in the growing field of measure- 
ment theory that it is impossible to survey them all. However, references 
have been provided at the end of each chapter, and it is hoped they will 
lead the reader into the literature. 


CHAPTER 1 
Relations 


1.1 Notation and Terminology 


In this chapter, we present a mathematical topic, the theory of relations. 
The concepts and techniques presented here will be used throughout the 
rest of the volume. The notation we shall use is summarized in Table 1.1, 
fundamental properties of relations are defined in Table 1.2, and types of 
relations are defined in Table 1.3. Many readers will be familiar with most 
of the material of this chapter. The reader may simply want to glance at 
Tables 1.1, 1.2, and 1.3. He should, however, familiarize himself with 
Theorems 1.2, 1.3, and 1.4 of Section 1.5 and with the material of Section 
1.8. 


1.2 Definition of a Relation 


Suppose A and B are sets. The Cartesian product of A with B, denoted
A × B, is the set of all ordered pairs (a, b) such that a is in A and b is in B.
More generally, if A₁, A₂, ..., Aₙ are sets, the Cartesian product


A₁ × A₂ × ... × Aₙ


is the set of all ordered n-tuples (a₁, a₂, ..., aₙ) such that a₁ ∈ A₁,
a₂ ∈ A₂, ..., aₙ ∈ Aₙ. The notation Aⁿ denotes the Cartesian product of
A with itself n times.
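
As a rough illustration of these definitions, the following Python sketch enumerates A × B and Aⁿ for two small finite sets. The particular sets, and the choice of the standard-library function itertools.product to build the tuples, are assumptions made only for this example.

```python
from itertools import product

# Hypothetical small sets chosen only for illustration.
A = {1, 2}
B = {"x", "y", "z"}

# A x B: all ordered pairs (a, b) with a in A and b in B.
A_times_B = set(product(A, B))

# A^n: the Cartesian product of A with itself n times (here n = 3).
A_cubed = set(product(A, repeat=3))

print(len(A_times_B))  # 6, which is |A| * |B|
print(len(A_cubed))    # 8, which is |A| ** 3
```

For finite sets the number of n-tuples is the product of the cardinalities of the factors, which is what the two printed counts reflect.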

A binary relation R on the set A is a subset of the Cartesian product
A × A, that is, a set of ordered pairs (a, b) such that a and b are in A. If A
is the set {1, 2, 3, 4}, then examples of binary relations on A are given by


R = {(1, 1), (1, 2), (2, 1), (3, 3), (3, 4), (4, 3)},                    (1.1)
S = {(1, 1), (1, 2), (2, 1), (2, 2), (3, 3), (3, 4), (4, 3), (4, 4)},    (1.2)

Table 1.1. Notation

Set-Theoretic Notation

∪            union
∩            intersection
⊆            subset (contained in)
⊊            proper subset
⊄            is not a subset
⊇            contains (superset)
∈            member of
∉            not a member of
∅            empty set
{...}        the set ...
{... : ...}  the set of all ... such that ...
Aᶜ           complement of A
A − B        A ∩ Bᶜ
|A|          cardinality of A, the number of elements in A

Logical Notation

~            not
⇒            implies
⇔            if and only if (equivalence)
∀            for all
∃            there exists
iff          if and only if

Sets of Numbers

Re           the real numbers
Re⁺          the positive real numbers
Q            the rational numbers
Q⁺           the positive rational numbers
N            the positive integers
Z            the integers

Miscellaneous

f ∘ g        composition of the two functions f and g
f(A)         the image of the set A under the function f; i.e., {f(a): a ∈ A}
≈            approximately equal to
≡            congruent to
Π            product
Σ            sum
∫            integral sign


Table 1.2. Properties of Relations

A Binary Relation (A, R) Is:    Provided That:

Reflexive                       aRa, all a ∈ A
Nonreflexive                    it is not reflexive
Irreflexive                     ~aRa, all a ∈ A
Symmetric                       aRb ⇒ bRa, all a, b ∈ A
Nonsymmetric                    it is not symmetric
Asymmetric                      aRb ⇒ ~bRa, all a, b ∈ A
Antisymmetric                   aRb & bRa ⇒ a = b, all a, b ∈ A
Transitive                      aRb & bRc ⇒ aRc, all a, b, c ∈ A
Nontransitive                   it is not transitive
Negatively transitive           ~aRb & ~bRc ⇒ ~aRc, all a, b, c ∈ A;
                                equivalently: xRy ⇒ xRz or zRy,
                                all x, y, z ∈ A
Strongly complete               for all a, b ∈ A, aRb or bRa
Complete                        for all a ≠ b ∈ A, aRb or bRa
Equivalence relation            it is reflexive, symmetric and
                                transitive

Table 1.3. Order Relations*

                                                 Relation Type
                                                   Strict   Strict            Strict
                        Quasi    Weak     Simple   Simple   Weak     Partial  Partial
Property                Order    Order    Order    Order    Order    Order    Order
Reflexive               ✓                                            ✓
Symmetric
Transitive              ✓        ✓        ✓        ✓                 ✓        ✓
Asymmetric                                         ✓        ✓                 ✓
Antisymmetric                             ✓                          ✓
Negatively transitive                                       ✓
Strongly complete                ✓        ✓
Complete                                           ✓


*A given type of relation can satisfy more of these properties than those indicated.
Only the defining properties are indicated.


and

T = {(1, 2), (1, 3), (1, 4), (2, 3), (2, 4), (3, 4)}.                    (1.3)


The binary relation T is the “less than” relation on A; an ordered pair
(a, b) is in the binary relation T if and only if a < b. Similarly, “less than”
defines a binary relation on the set A of all real numbers, as do “greater
than,” “equals,” and so on. Of course, the set A does not have to be a set
of numbers. If A is the set {SO₂, DDT, NO₂}, then examples of binary
relations on A are given by


U = {(SO₂, DDT), (DDT, NO₂), (SO₂, NO₂)}                                 (1.4)

and

V = {(SO₂, NO₂), (NO₂, SO₂)}.                                            (1.5)


In the case of a binary relation R on a set A, we shall usually write aRb to
denote the statement that (a, b) ∈ R. Thus, for example, if S is the
relation defined by Eq. (1.2), then 3S3 and 2S1, but not 3S1. If U is the
relation defined by Eq. (1.4), then SO₂ U DDT.
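
A finite binary relation can be stored directly as a set of ordered pairs, so that the statement aRb becomes a membership test. The sketch below does this for the relations of Eqs. (1.1) through (1.3); the helper name holds is hypothetical and is introduced only for this illustration.

```python
# Binary relations on A = {1, 2, 3, 4}, stored as sets of ordered pairs.
A = {1, 2, 3, 4}
R = {(1, 1), (1, 2), (2, 1), (3, 3), (3, 4), (4, 3)}                  # Eq. (1.1)
S = {(1, 1), (1, 2), (2, 1), (2, 2), (3, 3), (3, 4), (4, 3), (4, 4)}  # Eq. (1.2)
T = {(a, b) for a in A for b in A if a < b}                           # Eq. (1.3)

def holds(rel, a, b):
    """Return True when a rel b, i.e., when (a, b) belongs to rel."""
    return (a, b) in rel

print(holds(S, 3, 3), holds(S, 2, 1), holds(S, 3, 1))  # True True False
print(sorted(T))
```

Building T from the condition a < b reproduces exactly the six pairs listed in Eq. (1.3).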

Binary relations arise very frequently from everyday language. For 
example, if A is the set of people in the world, then the set 


F = {(a, b): a ∈ A and b ∈ A and a is the father of b}                   (1.6)


defines a binary relation on A, which we may call, by a slight abuse of 
language, “father of.” To give another example, suppose A is any collec- 
tion of alternatives among which you are choosing, for example, a collec- 
tion of designs for a regional transportation system, and suppose 


P = {(a, b) ∈ A × A: you (strictly) prefer a to b}.                      (1.7)


Then P may be called your relation of “strict preference.”* Someone else’s 
relation of preference might be quite different, and that of course is where 
problems arise. To give yet another example, suppose A is a set of airplane 
engines and L is the relation “sounds louder than when heard at a 
horizontal distance of 500 feet.” The relation L on A can play a role in the 
design of airplanes. It is hoped that the relation L is related to some 
physical characteristics of the engine design and of the sounds engines 
emit. The study of the relationship between the physical properties of these 
sounds and the psychological ones, such as perceived loudness, is the 
subject matter of the field called psychophysics. (We return to this study in 
Chapter 4.) 

*Strict preference is to be distinguished from weak preference: the former means “better
than,” and the latter “at least as good as.” See Section 1.5.

The properties of a relation are not clearly defined without giving its
underlying set. Thus, if A is the set of all people in the United States and B
is the set of all males in the United States, then the relation


R = {(a, b) ∈ A × A: a is the brother of b}                              (1.8)

is different from the relation

R′ = {(a, b) ∈ B × B: a is the brother of b}.                            (1.9)


For example, R′ has certain symmetry properties; that is, if aR′b, then
bR′a. These properties are not shared by R. Moreover, R′ has different
properties if it is thought of as a relation on the set B or as a relation on
the set A (even though R′ contains no ordered pairs with elements not in
B). To make sure that the set A on which a relation R is defined is given
explicitly, it is necessary to speak formally of a relational system (A, R)
rather than just a relation R. By an abuse of language, we shall simply call
(A, R) a relation. We shall see more precisely below why specification of
the underlying set is important. In general, if B ⊆ A and R is a binary
relation on A, we shall refer to


S = {(a, b) ∈ B × B: aRb}


as the restriction of R to B or the subrelation generated by B. Thus the
relation (B, R′) defined by Eq. (1.9) is the restriction of the relation (A, R)
of Eq. (1.8) to the set B of all males in the United States.
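
The restriction of a relation to a subset can be sketched in a few lines of Python. The function name restrict and the toy family data below are hypothetical, chosen only to mirror the “brother of” example on a very small scale.

```python
def restrict(rel, subset):
    """Restriction of rel to subset: keep the pairs whose entries both lie in subset."""
    return {(a, b) for (a, b) in rel if a in subset and b in subset}

# Hypothetical toy data: "brother of" on a small set of people.
people = {"Al", "Bob", "Cy", "Dee"}
males = {"Al", "Bob", "Cy"}
brother_of = {("Al", "Bob"), ("Bob", "Al"), ("Cy", "Dee")}  # Cy is Dee's brother

brother_of_on_males = restrict(brother_of, males)
print(sorted(brother_of_on_males))  # [('Al', 'Bob'), ('Bob', 'Al')]
```

In this toy data the pair (Cy, Dee) has no reverse pair, so the relation on all people is not symmetric, while its restriction to the males is.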

We may also speak of n-ary relations, where n is a positive integer. An
n-ary relation R on a set A is a subset of the Cartesian product Aⁿ. (We
shall frequently speak of a relation, rather than an n-ary relation, if n is
understood.) For example, if A = {1, 2, 6}, then a 3-ary or ternary relation
R on A is given by


R = {(1, 2, 6), (6, 2, 1), (6, 6, 6)}.                                   (1.10)


If A is the set of all lines in the plane, we might define a ternary relation R
on A as follows:


(a, b, c) ∈ R ⇔ a, b, and c are parallel and b is                        (1.11)
                strictly between a and c.


A similar ternary relation R might be defined on the set A of all students
in a given school. Then we would take


(a, b, c) ∈ R ⇔ b’s grade point average is strictly                      (1.12)
                between those of a and c.


If A = {1, 2, 6}, then a 4-ary or quaternary relation R on A is given by


R = {(1, 2, 2, 6), (1, 2, 1, 6), (6, 6, 6, 6)}.                          (1.13)


If A is once again a set of alternatives such as designs for alternative
regional transportation systems, you might make statements like S: “I
prefer a to b at least as much as I prefer c to d.” A quaternary relation on
A is given by the collection D of all ordered 4-tuples (a, b, c, d) such that
a, b, c, d are in A and for which you assert such a statement S of
comparative preference. In such a case, we shall use either the notation
D(a, b, c, d) or the notation abDcd to mean that (a, b, c, d) ∈ D.

In what follows, we shall most frequently deal with binary relations, and
shall often speak just of a relation when we mean a binary one.
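
An n-ary relation on a finite set can be generated by filtering the n-tuples of Aⁿ. The sketch below builds the ternary “betweenness” relation of Eq. (1.12) in Python; the student names and grade point averages are invented for illustration.

```python
from itertools import product

# Hypothetical grade point averages, invented for illustration.
gpa = {"ann": 3.9, "ben": 3.1, "cal": 2.4}
students = set(gpa)

# Ternary relation of Eq. (1.12): (a, b, c) is in R iff b's average lies
# strictly between those of a and c.
R = {
    (a, b, c)
    for a, b, c in product(students, repeat=3)
    if gpa[a] < gpa[b] < gpa[c] or gpa[c] < gpa[b] < gpa[a]
}

print(sorted(R))  # [('ann', 'ben', 'cal'), ('cal', 'ben', 'ann')]
```

Only the middle student can stand strictly between the other two, which is why just two of the 27 possible triples survive the filter.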


Exercises 


1. If (A, R) is a binary relation, the converse is the relation R⁻¹ on A
defined by


aR⁻¹b iff bRa.

(The notation R is sometimes used in place of R⁻¹.) For example, if A is
the set of all males in the United States and (A, R) is “father of,” then
(A, R⁻¹) is “son of.” Identify the converse of the following relations:

(a) “Sister of” on the set of all people in the United States. 

(b) “Uncle of” on the set of all people in the Soviet Union. 

(c) “Greater than” on the set Re. 

(d) (Re, =). 

(e) “As tall as” on the set of all men in New Jersey. 

2. If (A, R) and (A, S) are binary relations, the intersection R ∩ S on A
is defined by

R ∩ S = {(a, b): aRb and aSb}.


For example, if (A, R) is “brother of” and (A, S) is “sibling of,” then
(A, R ∩ S) is “brother of.” Identify (A, R ∩ S) in the following cases:
(a) A = a set of people, R = “father of,” S = “relative of.”
(b) A = Re, R = ≥, S = ≠.
(c) A = a set of people, R = “older than,” S = “father of.”
(d) A =a set of sets, R = &, S =“are disjoint.” 
3. If (A, R) and (A, S) are binary relations, the union R ∪ S on A is
defined by

R ∪ S = {(a, b): aRb or aSb}.


For example, if (A, R) is “brother of” and (A, S) is “sister of,” then
(A, R ∪ S) is “sibling of.” Identify (A, R ∪ S) in the examples of Exer. 2.

4. If (A, R) and (A, S) are binary relations, then the relative product
R ∘ S on A is defined by


R ∘ S = {(a, b): for some c, aRc and cSb}.


For example, if (A, R) is “father of” and (A, S) is “parent of,” then
a(R ∘ S)b holds if and only if, for some c, a is father of c and c is parent
of b, that is, if and only if a is grandfather of b. Identify R ∘ S in the
following examples:
(a) A = a set of people, R = “father of,” S = “mother of.”
(b) A = a set of people, R = “older than,” S = “older than.”
(c) A = Re, R = >, S = >.
(d) A = Re, R = >, S = <.
5. Show the following:
(a) (R ∪ S) ∘ T = (R ∘ T) ∪ (S ∘ T).
(b) (R ∩ S) ∘ T may not be (R ∘ T) ∩ (S ∘ T).
(c) R ∘ (S ∘ T) = (R ∘ S) ∘ T.
(d) (R ∘ S)⁻¹ may not be R⁻¹ ∘ S⁻¹.
6. Show that (A, S ∘ R) can be different from (A, R ∘ S).
7. Suppose A = {1, 2, 6} and (A, R) is the quaternary relation of Eq.
(1.13). Then we may define a ternary relation S on A and a binary relation
T on A by


(a, b, c) ∈ S ⇔ (∃x)[R(a, b, c, x)]

and

aTb ⇔ (∃x)(∃y)[R(a, b, x, y)].

(a) Write out (A, S) and (A, T) as sets of ordered n-tuples.

(b) Let U be the quaternary relation defined by the restriction of R to
B = {1, 6}. Write out (B, U).
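
For readers who wish to experiment, the operations defined in Exers. 1 through 4 can be sketched for finite relations as follows. The function names converse and relative_product and the three-person family are hypothetical; intersection and union are simply Python's built-in set operators & and |.

```python
def converse(rel):
    """Converse: a is related to b in the converse iff b rel a."""
    return {(b, a) for (a, b) in rel}

def relative_product(r, s):
    """Relative product: a (r o s) b iff a r c and c s b for some c."""
    return {(a, b) for (a, c) in r for (c2, b) in s if c == c2}

# Hypothetical family: Abe is the father of Bea, and Bea is a parent of Cal.
father_of = {("Abe", "Bea")}
parent_of = {("Abe", "Bea"), ("Bea", "Cal")}

print(converse(father_of))                     # {('Bea', 'Abe')}
print(relative_product(father_of, parent_of))  # {('Abe', 'Cal')}
print(father_of & parent_of == father_of)      # True
print(father_of | parent_of == parent_of)      # True
```

The relative product recovers the “grandfather of” pair used as the worked example in Exer. 4.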


1.3 Properties of Relations 


There are certain properties that are common to many naturally occur- 
ring relations. We discuss some of these properties in this section, and they 
are summarized in Table 1.2 at the beginning of the chapter. Let us say 
that a binary relation (A, R) is reflexive if, for all a ∈ A, aRa. Thus, for 
example, if A is a set of real numbers and R is the relation “equality” on 
A, then (A, R) is reflexive because a number is always equal to itself. If 
A = {1, 2, 3,4}, then the relation (A, R) defined by Eq. (1.1) is not 
reflexive, since 2R2 does not hold. On the other hand, the restriction of 
this relation to the set B = {1, 3} is reflexive. This observation demon- 
strates again why it is important to refer to the underlying set when 
speaking of a relation. 

If A is the set of people in the world and F is the relation “father of” on 
A, then (A, F) is not reflexive, since a person is not his own father. This 
relation is not reflexive in a very strong way, since the condition of 
reflexivity is violated for every a in A. We shall say that a binary relation 
(A, R) is irreflexive if, for all a ∈ A, ~aRa. Thus the relation “father of”
is irreflexive. So is the relation (A, T) where A = {1, 2, 3, 4} and T is
defined by Eq. (1.3), and the relation (A, U) where A =
{SO₂, DDT, NO₂} and U is defined by Eq. (1.4). This terminology should
be distinguished from the terminology nonreflexive, which means simply 
“not reflexive.” 
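
The distinction between reflexive, nonreflexive, and irreflexive, and the role of the underlying set, can be checked mechanically for the finite examples above. This is a minimal sketch; the function names are hypothetical.

```python
def is_reflexive(A, rel):
    """(A, rel) is reflexive if a rel a for every a in A."""
    return all((a, a) in rel for a in A)

def is_irreflexive(A, rel):
    """(A, rel) is irreflexive if a rel a holds for no a in A."""
    return all((a, a) not in rel for a in A)

A = {1, 2, 3, 4}
R = {(1, 1), (1, 2), (2, 1), (3, 3), (3, 4), (4, 3)}   # Eq. (1.1)
T = {(a, b) for a in A for b in A if a < b}            # Eq. (1.3)

# R is nonreflexive without being irreflexive; T is irreflexive.
print(is_reflexive(A, R), is_irreflexive(A, R))  # False False
print(is_reflexive(A, T), is_irreflexive(A, T))  # False True

# The restriction of R to B = {1, 3} is reflexive.
B = {1, 3}
R_on_B = {(a, b) for (a, b) in R if a in B and b in B}
print(is_reflexive(B, R_on_B))  # True
```

Both tests take the underlying set as an argument, since the same set of pairs can be reflexive on one set and not on another.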

A binary relation (A, R) is called symmetric if, for all a, b ∈ A,

aRb ⇒ bRa.

That is, (A, R) is symmetric if, whenever (a, b) ∈ R, then (b, a) ∈ R. The
relation “equality” on any set of numbers is symmetric. So are the relations
(A, R) where A = {1, 2, 3, 4} and R is defined by Eq. (1.1), and (A, V)
where A = {SO₂, DDT, NO₂} and V is defined by Eq. (1.5). The relation
(A, T) where A = {1, 2, 3, 4} and T is given by Eq. (1.3), is not symmetric,
for 1T2 but ~2T1. The relation “brother of” on the set of all people in
the United States is not symmetric, for if a is the brother of b, it does not
follow that b is the brother of a. However, the relation “brother of” on the
set of all males is symmetric.
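
Symmetry depends only on the pairs in the relation, so for a finite relation it can be tested by checking that every pair appears together with its reverse. A minimal sketch, with a hypothetical function name:

```python
def is_symmetric(rel):
    """rel is symmetric if a rel b always implies b rel a."""
    return all((b, a) in rel for (a, b) in rel)

A = {1, 2, 3, 4}
R = {(1, 1), (1, 2), (2, 1), (3, 3), (3, 4), (4, 3)}   # Eq. (1.1)
T = {(a, b) for a in A for b in A if a < b}            # Eq. (1.3)

print(is_symmetric(R))  # True
print(is_symmetric(T))  # False: 1 T 2 holds but 2 T 1 does not
```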

Other examples of nonsymmetric (not symmetric) relations are the rela- 
tion “father of” on the set of people in the world, the relation P of strict 
preference on a set of alternatives, and the relation “sounds louder than” 
on a set of sounds. These three relations are highly nonsymmetric. They 
are asymmetric, i.e., they satisfy the rule

aRb ⇒ ~bRa.

Other asymmetric relations are the relation “greater than,” >, on the set
of real numbers, the relation “strictly contained in,” ⊊, on any collection
of sets, and the relation (A, U) where A = {SO₂, DDT, NO₂} and U is
given by Eq. (1.4).

Some relations (A, R) are not quite asymmetric, but are almost asym- 
metric in the sense that aRb and bRa holds only if a = b. Let us say that 
(A, R) is antisymmetric if, for all a, b ∈ A,


aRb & bRa ⇒ a = b.

An example of an antisymmetric relation is the relation “greater than or
equal to,” ≥, on the set of real numbers. Another example is “contained
in,” ⊆, on any collection of sets. Every asymmetric binary relation (A, R)
is antisymmetric. But the converse is false: the relation ≥ is antisymmetric
but not asymmetric. 
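
The difference between asymmetry and antisymmetry can be seen on a small finite example. The sketch below uses “greater than” and “greater than or equal to” on the set {1, 2, 3, 4} rather than on all real numbers, an assumption made so that the relations can be enumerated; the function names are hypothetical.

```python
def is_asymmetric(rel):
    """a rel b implies not b rel a (so no pair (a, a) can occur)."""
    return all((b, a) not in rel for (a, b) in rel)

def is_antisymmetric(rel):
    """a rel b and b rel a together imply a == b."""
    return all(a == b for (a, b) in rel if (b, a) in rel)

A = {1, 2, 3, 4}
greater = {(a, b) for a in A for b in A if a > b}
greater_eq = {(a, b) for a in A for b in A if a >= b}

print(is_asymmetric(greater), is_antisymmetric(greater))        # True True
print(is_asymmetric(greater_eq), is_antisymmetric(greater_eq))  # False True
```

The pairs (a, a) are what separate the two notions: they are allowed by antisymmetry but ruled out by asymmetry.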

A relation (A, R) is called transitive if, for all a, b, c ∈ A, whenever aRb
and bRc, then aRc. In symbols, (A, R) is transitive if


aRb & bRc ⇒ aRc.


Examples of transitive relations are the relations “equality” and “greater 
than” on the set of real numbers and “implies” on a set of statements. It is 
left to the reader to verify that the relations (A, S) where A = {1, 2, 3, 4} 
and S is defined by Eq. (1.2), and (A, U) where A = {SO₂, DDT, NO₂}
and U is defined by Eq. (1.4), are transitive. It seems reasonable to assume
that the relation of strict preference among alternative designs of
transportation systems is transitive, for if you prefer a to b and b to c, you
should be expected to prefer a to c. We shall discuss this point further in
later chapters. Similarly, it seems reasonable to assume that the relation L,
“sounds louder than,” on a set of airplane engines is transitive, though this
must be left to empirical data to verify. If A = {SO₂, DDT, NO₂} and V is
defined by Eq. (1.5), then (A, V) is not transitive. For SO₂ V NO₂ and
NO₂ V SO₂, but not SO₂ V SO₂. Another relation that is not transitive is the
relation “father of.” 

In studying binary relations (A, R), it will often be convenient to use the 
abbreviation aRbRc for the statement aRb & bRc. Thus, (A, R) is transi- 
tive if aRbRc implies aRc. Similarly, aRbRcRd will abbreviate aRb & bRc 
& cRd. And so on. If transitivity holds, then aRbRcRd implies aRd. More 
generally, a₁Ra₂Ra₃ ... Raₙ implies a₁Raₙ. The proof is easily accomplished
by mathematical induction.
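
For a finite relation, transitivity can be tested by examining every chain aRbRc. The following sketch checks the relation S of Eq. (1.2) and a small “father of” relation; the function name and the family data are hypothetical.

```python
def is_transitive(rel):
    """a rel b and b rel c together imply a rel c."""
    return all((a, d) in rel for (a, b) in rel for (c, d) in rel if b == c)

S = {(1, 1), (1, 2), (2, 1), (2, 2), (3, 3), (3, 4), (4, 3), (4, 4)}  # Eq. (1.2)

# Hypothetical family: Abe is the father of Bea, who is the father of Cal.
father_of = {("Abe", "Bea"), ("Bea", "Cal")}

print(is_transitive(S))          # True
print(is_transitive(father_of))  # False: Abe is not the father of Cal
```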

A binary relation (A, R) is called negatively transitive if, for all a, b, c ∈
A, not aRb and not bRc imply not aRc. A binary relation (A, R) is
negatively transitive if the relation “not in the relation R,” defined on the
set A, is transitive. To give an example, the relation R = “greater than” on
a set of real numbers is negatively transitive, for “not in R” is the relation
“less than or equal to,” which is transitive. It is easy to show that if
A = {SO₂, DDT, NO₂} and U is defined by Eq. (1.4), then (A, U) is
negatively transitive. Similarly, strict preference on a set of alternatives and
“sounds louder than” on a set of sounds are probably negatively transitive.
Verifying negative transitivity can be annoyingly confusing; it is often
easier to test the following equivalent condition: For all x, y, z ∈ A, if
xRy, then xRz or zRy. To prove that these two conditions are equivalent,
we observe that the equivalent version is just the contrapositive of the
condition in negative transitivity.* Using this notion, we see easily that the
relation “greater than” is negatively transitive, for if x > y, then for all z,
either x > z or z > y. Similarly, one sees that “contained in” is not
negatively transitive, for if x is contained in y, there may very well be a z
so that x is not contained in z and z is not contained in y. (See Fig. 1.1 for
an example.) The relation “father of” on the set of all people in the world
is not negatively transitive, nor is the relation (A, R) where A =
{1, 2, 3, 4} and R is given by Eq. (1.1). To see the latter, note that 1R2,
but not 1R3 and not 3R2.


Figure 1.1. The relation “contained in” is not necessarily negatively transitive.
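
The two equivalent tests for negative transitivity can be compared directly on finite examples. The sketch below implements both, under hypothetical function names, and applies them to the relation R of Eq. (1.1) and to “greater than” on {1, 2, 3, 4}.

```python
from itertools import product

def is_negatively_transitive(A, rel):
    """Definition: not a rel b and not b rel c imply not a rel c."""
    return all(
        (a, c) not in rel
        for a, b, c in product(A, repeat=3)
        if (a, b) not in rel and (b, c) not in rel
    )

def is_negatively_transitive_alt(A, rel):
    """Equivalent test: x rel y implies x rel z or z rel y, for every z in A."""
    return all((x, z) in rel or (z, y) in rel for (x, y) in rel for z in A)

A = {1, 2, 3, 4}
R = {(1, 1), (1, 2), (2, 1), (3, 3), (3, 4), (4, 3)}   # Eq. (1.1)
greater = {(a, b) for a in A for b in A if a > b}

# R fails both tests (1 R 2, but neither 1 R 3 nor 3 R 2); "greater" passes both.
print(is_negatively_transitive(A, R), is_negatively_transitive_alt(A, R))
print(is_negatively_transitive(A, greater), is_negatively_transitive_alt(A, greater))
```

The second test only loops over pairs already in the relation, which is usually the easier check to carry out by hand as well.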


Exercises 


1. (a) Show that if (A, R) is a binary relation and (A, R⁻¹) is its
converse (Exer. 1, Section 1.2), then (A, R⁻¹) is reflexive if and only if
(A, R) is.

(b) Is a similar statement true for the property irreflexive? 
(c) Symmetric? 

(d) Asymmetric? 

(e) Antisymmetric? 

(f) Transitive? 

(g) Negatively transitive? 


2. Suppose (A, R) and (A, S) are binary relations. 
(a) If both relations (A, R) and (A, S) are reflexive, is the intersection
(A, R ∩ S) reflexive?
(b)—(g) Repeat for the properties in (b) through (g) of Exer. 1. 
3. Repeat Exer. 2 for the union (A, R U S). 
4. Repeat Exer. 2 for the relative product (A, R ∘ S). 
5. Which of the properties in (a) through (g) of Exer. 1 hold for the 
relational system (A, ∅)? 
6. Which of the properties in (a) through (g) of Exer. 1 hold for the 
relational system (A, A X A)? 


*If S is the statement A implies B, then the contrapositive of S is the statement “not B” 
implies “not A.” 

7. (a) Show that it is not possible for a binary relation to be both 
symmetric and asymmetric. 
(b) Show that it is possible for a binary relation to be both symmet- 
ric and antisymmetric. 


8. Show that there are binary relations that are 
(a) transitive but not negatively transitive; 
(b) negatively transitive but not transitive; 

(c) neither negatively transitive nor transitive; 
(d) both negatively transitive and transitive. 


9. (a) Consider the relation x divides y on the set of positive integers. 
Which of the properties in (a) through (g) of Exer. 1 does this relation 
have? 

(b) Repeat part (a) for the relation “uncle of” on a set of people. 
(c) For the relation of “having the same weight as” on a set of mice. 
(d) For the relation “feels smoother than” on a set of objects. 

(e) For the relation “admires” on a set of people. 


10. Suppose A = Re and

aRb ⇔ a > b + 1.


This relation will arise in our study of preference in Chapter 6. Which of 
the properties in (a) through (g) of Exer. 1 hold for (A, R)? 


11. Consider the binary relation (A, S) where A = Re and

aSb ⇔ |a − b| ≤ 1.


This relation is closely related to the binary relation (A, R) of Exer. 10, 
and will arise in our study of indifference in Chapter 6. Which of the 
properties in (a) through (g) of Exer. 1 hold for (A, S)? 


12. If (A, R) is a binary relation, the symmetric complement is the binary 
relation S on A defined by 


aSb ⇔ (~aRb & ~bRa).


Note that if R is strict preference, then S is indifference: you are indiffe- 
rent between two alternatives if and only if you prefer neither. 
(a) Show that the symmetric complement is always symmetric. 
(b) Show that if (A,R) is negatively transitive, then the symmetric 
complement is transitive. 
(c) Show that the converse of (b) is false. 
(d) Show that if A= Re and R is as defined in Exer. 10, then S as 
defined in Exer. 11 is the symmetric complement. 
(e) Identify the symmetric complement of the following relations: 
(i) (Re, >). 
(ii) (Re, =). 
(iii) (N, R), where xRy means that x does not divide y.
13. The data of Table 1.4 shows the (consensus) preferences among
composers of the members of an orchestra. The i,j entry is 1 if and only if
composer i is (strictly) preferred to composer j. Which of the properties in
(a) through (g) of Exer. 1 hold for the orchestra members’ relation of
preference on the set


A={Beethoven, Brahms, Mozart, Wagner}? 


Table 1.4. Preferences of Orchestra Members Among Composers* 
(The i,j entry is 1 iff composer i is (strictly) preferred to composer j.) 


Beethoven Brahms Mozart Wagner 
Beethoven 0 1 1 1 
Brahms 0 0 1 1 
Mozart 0 0 0 1 
Wagner 0 0 0 0 


*Data based on an experiment of Folgmann [1933], with strict preference for the 
orchestra taken to mean a majority of the orchestra members have that preference. 


14. The data of Table 1.5 represent taste preferences for vanilla pud- 
dings by a group of judges, with entry i,j equal to 1 if and only if the 
group (strictly) prefers pudding i to pudding j. Which of the properties in 
(a) through (g) of Exer. 1 hold for this relation of preference on the set 
{1, 2, 3, 4, 5}? 


Table 1.5. Taste Preference for Vanilla Puddings* 
(Entry i,j is 1 iff pudding i is (strictly) preferred 
to pudding j by a group of judges.) 


2 3 4 
*Data obtained from an experiment of Davidson 
and Bradley [1969]. 


15. The data of Table 1.6 state judgments of relative loudness of 
different sounds, with entry i,j taken to be 1 if and only if sound i is 
judged (definitely) louder than sound j. Which of the properties in (a) 
through (g) of Exer. 1 hold for the relation “louder than” on the set of 
sounds {1, 2, 3, 4, 5}? 


Table 1.6. Judgments of Relative Loudness. 
(Entry i,j is 1 iff sound i is judged 
(definitely) louder than sound j.) 


1 3 4 


16. The data of Table 1.7 present the judgments of “sameness” among 
different sounds by individuals in a group, with entry i,j taken to be 1 if 
and only if sound i is judged to be the same as sound j a sufficiently large 
percentage of the time. Which of the properties in (a) through (g) of Exer. 
1 hold for this relation of sameness on the set {B, C, F, J}? 


Table 1.7. Judgments of Sameness of Sounds* 
(Entry i,j is taken to be 1 iff sound i
is judged to be the same as sound j
at least 25% of the time.) 


B Cc F 
*Data from an experiment of Rothkopf [1957]. 


17. The data of Table 1.8 present judgments of relative importance of 
different objectives for a library system in Dallas, with entry i,j equal to 1
if and only if objective i is judged more important than objective j. Which
of the properties in (a) through (g) of Exer. 1 hold for this relation of 
relative importance on the set {a, b, c, d, e, f}? 


Table 1.8. Judgments of Relative Importance of Objectives for a 
Library System in Dallas* 
(Entry i,j is 1 iff i is judged more important than j.) 


b d 
*Data from Farris [1975].


Key

a  Convenient, accessible library facilities
b  Convenient operating hours
c  Efficient inter-library system
d  Good local libraries
e  Good reproduction facilities
f  Rapid inter-library response time


18. The data of Table 1.9 present judgments of relative importance of
different objectives for a state environmental agency in Ohio, with entry
i,j taken to be 1 if and only if objective i is judged more important than
objective j. Which of the properties in (a) through (g) of Exer. 1 hold for this
relation of relative importance on the set {a, b, c, d, e, f}?


Table 1.9. Judgments of Relative Importance of Goals for a
State Environmental Agency in Ohio*
(Entry i,j is 1 iff i is judged more important than j.)


*Data from Hart [1975].


Key

a Enhance and protect State’s environment

b Improve and insure the quality of the air

c Develop a comprehensive program of environmental quality
planning

d Protect and promote the State’s natural attractions

e Prevent the future occurrence of pollution emergencies

f Promote an environment that is beneficial to human health
and welfare


1.4 Equivalence Relations 


Many binary relations satisfy the three properties reflexivity, symmetry, 
and transitivity. Such relations are called equivalence relations. The relation 
equality on any set of numbers is an equivalence relation. So is the relation 
(A, S), where A is {1, 2, 3, 4} and S is defined by Eq. (1.2). If A is a set of 
lines and aRb holds if and only if a and b are parallel, then (A, R) is an
equivalence relation provided that we say a line is parallel to itself. Other
examples are the following: A is the set of integers {0, 1, 2, ..., 26}, and
aRb iff a ≡ b (mod 3); A is a set of people who have been blood-typed, and
aRb iff a and b have the same blood type; A is a set of people, and aRb iff
a has the same height as b; A is a set of children, and aRb iff a and b have
the same IQ; A is a set of animals, and aRb iff a and b are in the same 
species. 

If (A, R) is an equivalence relation and a ∈ A, let a* denote


{b ∈ A: aRb}.


This set will be called the equivalence class containing a. The element a will
be called a representative of the equivalence class a*. To give an example,
if A = {1, 2, 3, 4} and S is defined by Eq. (1.2), then


1* = {1, 2},
2* = {1, 2},
3* = {3, 4},
4* = {3, 4}.


Here, there are two different equivalence classes, {1, 2} and {3, 4}. If
A = {0, 1, 2, ..., 26} and aRb iff a ≡ b (mod 3), then


0* = {0, 3, 6, 9, 12, 15, 18, 21, 24}, 
1* = {1, 4, 7, 10, 13, 16, 19, 22, 25}, 
2* = {2, 5, 8, 11, 14, 17, 20, 23, 26}, 
3* = {0, 3, 6, 9, 12, 15, 18, 21, 24} = 0*, and so on.


Here, there are three different equivalence classes, 0*, 1*, and 2*. 
The most important properties of equivalence relations are summarized 
in the following theorem. 


THEOREM 1.1. Suppose (A,R) is an equivalence relation. Then: 

(a) Any two equivalence classes are either disjoint or identical. 

(b) The collection of (distinct) equivalence classes partitions A; that is, 
every element of A is in one and only one (distinct) equivalence class. 


Proof. To prove (a), we shall show that for all a, b ∈ A, either a* = b* or
a* ∩ b* = ∅. In particular, we shall show that


aRb ⇒ a* = b* (1.14)
and
~aRb ⇒ a* ∩ b* = ∅. (1.15)


To demonstrate (1.14), we assume aRb and show that aRc holds if and
only if bRc holds. Thus, suppose aRc holds. Then cRa follows, since (A, R)
is symmetric. Now we have cRa and aRb, so we conclude cRb from
transitivity of (A, R). Finally, cRb implies bRc, by symmetry. A similar
proof, left to the reader, shows that bRc implies aRc. Thus, we have
established (1.14). To prove (1.15), we argue by contraposition: suppose
a* ∩ b* ≠ ∅ and let c ∈ a* ∩ b*.
Then c ∈ a* implies aRc, and c ∈ b* implies bRc. By symmetry, we have
cRb. Finally, by transitivity, aRc and cRb imply aRb. This proves (1.15),
and completes the proof of part (a).

To prove part (b), note that every element a in A is in some equivalence
class, namely a* (since aRa by reflexivity), and in no more than one, by part (a). ∎
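The congruence example can be checked mechanically. The following is a minimal sketch in Python, assuming a relation is stored as a set of ordered pairs; the helper names `is_equivalence` and `equivalence_class` are ours and do not appear elsewhere in the text. It confirms that congruence (mod 3) on {0, 1, ..., 26} is an equivalence relation and that its distinct classes partition A, as Theorem 1.1 asserts.

```python
from itertools import product

def is_equivalence(A, R):
    """Return True if R (a set of ordered pairs) is reflexive, symmetric, and transitive on A."""
    reflexive = all((a, a) in R for a in A)
    symmetric = all((b, a) in R for (a, b) in R)
    transitive = all((a, c) in R for (a, b) in R for c in A if (b, c) in R)
    return reflexive and symmetric and transitive

def equivalence_class(A, R, a):
    """Return a* = {b in A : aRb} as a frozenset."""
    return frozenset(b for b in A if (a, b) in R)

A = range(27)  # the set {0, 1, ..., 26}
R = {(a, b) for a, b in product(A, repeat=2) if a % 3 == b % 3}

# Collecting the classes in a set discards repeats such as 3* = 0*.
classes = {equivalence_class(A, R, a) for a in A}

print(is_equivalence(A, R))
print(sorted(sorted(c) for c in classes))
```

Storing each class as a frozenset lets equal classes coincide automatically, which mirrors part (a) of the theorem: two classes are either identical or disjoint.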


In general, whenever (A, R) is an equivalence relation, we can gain
much in the way of economy by dealing with the equivalence classes rather
than the objects of A themselves. There are in general far fewer classes than objects.
We shall see this clearly in the next section, when we discuss the process of 
reduction. 


Exercises 


1. Suppose A is the set of sequences of 0’s and 1’s of length 10, and aRb 
holds if and only if sequences a and b have the same number of 1’s. 
(a) Show that (A, R) is an equivalence relation. 
(b) Identify the equivalence classes. 


2. (a) Show that if A is a set of sounds and aRb holds if and only if a
and b sound equally loud, then (A, R) may not be an equivalence relation.

(b) If A is a set of sounds, and aRb holds if and only if a and b are
measured to have the same decibel level (a measure of sound intensity), is
(A, R) an equivalence relation?

(c) If A is a set of people and aRb holds if and only if a and b look
equally tall, is (A, R) an equivalence relation?

(d) If A is a set of people and aRb holds if and only if a and b are
measured to have the same height, is (A, R) an equivalence relation?


3. Suppose (A,S) is the binary relation of Exer. 11, Section 1.3. Is (A, S) 
an equivalence relation? 


4. Show that all the properties in the definition of an equivalence 
relation are needed. In particular, show that there are binary relations that 
are 

(a) reflexive, symmetric, and not transitive; 
(b) reflexive, transitive, and not symmetric; 
(c) symmetric, transitive, and not reflexive. 
5. Suppose (A, R) and (A, S) are equivalence relations.
(a) Show that (A, R ∩ S) is an equivalence relation.
(b) Show that (A, R ∪ S) does not have to be an equivalence relation.
(c) What about (A, R ∘ S)?


6. Show that if (A, R) is an equivalence relation, then
a* = b* iff a ∈ b*.


7. If (A, R) is an equivalence relation, we may define a binary relation
R* on the set A* of equivalence classes as follows: if α and β are
equivalence classes, and a is in α and b is in β, then


αR*β iff aRb.


(a) Show that R* is well-defined in the sense that if a′ is in α and b′ is
in β, then
aRb iff a′Rb′.


(b) Moreover, show that (A*, R*) is an equivalence relation. 


8. Suppose A is finite and R is a binary relation on A. Let R² be the
binary relation R ∘ R on A, and define Rⁿ to be Rⁿ⁻¹ ∘ R.


(a) Show that if n ≥ 1, then aRⁿb if and only if there are
a_1, a_2, ..., a_{n-1} so that aRa_1, a_1Ra_2, ..., a_{n-2}Ra_{n-1}, a_{n-1}Rb.

(b) Define R⁰ on A to be {(a, a): a ∈ A}. If aRⁿb for some n ≥ 0,
we say that there is a (finite) path from a to b of length n. If A is finite,
show that there is a number k so that if there is a path from a to b, there is
one of length at most k.

(c) Let S be defined on A as follows: aSb if and only if there is a 
path from a to b. Show that (A, S) is transitive. 

(d) Let T be defined on A as follows: T = S ∩ S⁻¹. Show that aTb
if and only if there is a path from a to b and a path from b to a.

(e) Show that (A, T) is an equivalence relation. (The equivalence
classes are called strong components.) 

(f) (A, R) is called strongly connected if there is just one equivalence 
class under T. Is the relation < on a finite set of numbers strongly 
connected? 

(g) If A is finite, the transitive closure of (A, R) is the transitive 
relation on A containing all ordered pairs in R and the smallest possible 
number of ordered pairs. Show that the notion of transitive closure is 
well-defined. 

(h) How is transitive closure related to the relation S defined in (c)? 

(i) Identify R, R°, S, T, the strong components, and the transitive 
closure in the following examples. Determine which of these examples is 
strongly connected. 


(i) A = Re, R = ≥.
(ii) A = {1, 2, 3, 4}, R = {(1, 2), (2, 3), (3, 4)}.
(iii) A = {1, 2, 3}, R = {(1, 2), 2 D, (1, 9}.


}; 
(iv) A = {+, *, #,$}, R = {((+, #), (, *), C+, *)}. 
9. (a) Does the data of Table 1.4 define an equivalence relation?
(b) What about the data of Table 1.5? 
(c) Table 1.6? 
(d) Table 1.7? 
(e) Table 1.8? 
(f) Table 1.9? 


1.5 Weak Orders and Simple Orders 


In this section, we define various order relations. The results are 
summarized at the beginning of the chapter in Table 1.3, which the reader 
is urged to consult for reference. 

Suppose (A, P) is the binary relation of (strict) preference defined by Eq. 
(1.7), where A is a set of alternatives among which you are choosing—for 
example, alternative designs for a regional transportation system. In gen- 
eral, we can suppose that for each pair of alternatives a and b, you do one 
of three things: You prefer a to b, you prefer b to a, or you are indifferent 
between a and b. Let us say you weakly prefer a to b if either you (strictly)
prefer a to b or you are indifferent between a and b. We denote the binary
relation of weak preference on the set A by W, and the binary relation of
indifference on A by I. Then we have


aWb ⇔ (aPb or aIb).


It is reasonable to assume that the relation (A, W) is both reflexive and 
transitive, though later we shall question transitivity. A binary relation that 
satisfies these two properties is called a quasi order or pre-order. The 
relation ≥ on the set of real numbers is another example of a quasi order;
so is the relation ⊆ on any collection of sets; so is the relation “at least as
tall as”; and so is any equivalence relation. 

The relation (A, W) presumably also has the property that for every a 
and b in A, including a = b, either aWb or bWa. A binary relation with 
this property is called strongly complete (sometimes the terms connected or 
strongly connected are used). A binary relation is called a weak order if it 
is transitive and strongly complete. Thus, weak preference is a weak order. 
So is the relation ≥ on the set Re of real numbers. The relation > on Re
is not a weak order, for it is not strongly complete; it is not the case that 
1 > 1. Similarly, S is not weak, because it is not strongly complete. Any 
weak order (A, R) is a quasi order. It suffices to prove reflexivity, which 
follows by strong completeness. A quasi order is not necessarily a weak 
order. An example of a quasi order that is not weak is given by any 
equivalence relation with more than one equivalence class, as, for example, 
the relation (A, S), where A = {1, 2, 3, 4} and S is defined by Eq. (1.2). 
One of the most helpful examples of a weak order is the following, which 
we shall use as an example frequently. Let A = {0, 1,2, ..., 26}. Every 
number a in A is congruent (mod 3) to one of the numbers 0, 1, or 2. Let 
us call this number a mod 3. Thus, 8 mod 3 is 2, 10 mod 3 is 1, etc. We 
define R on A as follows: 


aRb ⇔ a mod 3 ≥ b mod 3. (1.16)


Thus, 2R1, 8R10, etc. It is left to the reader to prove that (A, R) is a weak
order. We may think of (A,R) as follows: Elements of A are listed in 
vertical columns above the number 0, 1, or 2 to which they are congruent. 
Then aRb if and only if a is at least as far to the right as b. (See Fig. 1.2.) It 
will follow from Theorem 1.2 below that essentially every weak order on a 
finite set can be thought of as an ordering “weakly to the right of” on a 
comparable array of points arranged in vertical columns. A sample array is 
shown in Fig. 1.3. 
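The claim that the relation of Eq. (1.16) is a weak order can be verified by brute force. The sketch below, in Python, again assumes a relation is a set of ordered pairs; the function names are illustrative choices of ours. For contrast it also tests the relation > and congruence (mod 3) on the same set, neither of which is strongly complete.

```python
from itertools import product

def is_transitive(A, R):
    return all((a, c) in R for (a, b) in R for c in A if (b, c) in R)

def is_strongly_complete(A, R):
    # The case a = b is included, so strong completeness forces reflexivity.
    return all((a, b) in R or (b, a) in R for a, b in product(A, repeat=2))

def is_weak_order(A, R):
    return is_transitive(A, R) and is_strongly_complete(A, R)

A = range(27)
pairs = list(product(A, repeat=2))
R = {(a, b) for a, b in pairs if a % 3 >= b % 3}      # Eq. (1.16)
GT = {(a, b) for a, b in pairs if a > b}              # the relation > restricted to A
CONG = {(a, b) for a, b in pairs if a % 3 == b % 3}   # congruence (mod 3)

print(is_weak_order(A, R), is_weak_order(A, GT), is_weak_order(A, CONG))
```

The congruence relation is a quasi order with three equivalence classes, so it illustrates the remark that a quasi order need not be weak.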

A weak order (A, R) that is also antisymmetric is called a simple order.
(The terms linear order and total order are also used.) The prototype of
simple orders is the relation ≥. In a simple order R on a finite set A, the
elements of A may be laid out on the line with aRb holding if and only if a
is to the right of b or equal to b, as shown in Fig. 1.4. (We prove this
formally in Section 3.1.)

Figure 1.2. The binary relation ≥ (mod 3) on A = {0, 1, 2, ..., 26}.


Figure 1.3. A weak order (A, R); aRb iff a is to the right of b or in the same vertical column
as b.


Figure 1.4. A simple order. 


If (A, R) is a binary relation, suppose we define a binary relation E on A 
by 


aEb ⇔ aRb & bRa. (1.17)


If we think of an array like that of Fig. 1.3, the relation E can be
interpreted as “being in the same vertical column.” If (A, R) is a simple
order, then antisymmetry implies that E is equality. More generally, if
(A, R) is a quasi order, then E is an equivalence relation. To see this,
suppose (A, R) is a quasi order. We verify first that (A, E) is symmetric. If
aEb holds, then aRb and bRa hold; hence bRa and aRb hold; hence bEa 
holds. (Notice that this proof does not use any of the properties of a quasi 
order.) Proof that (A, E) is reflexive and transitive is left to the reader. 

If (A,R) is a weak order, then it is a quasi order, and so E is an 
equivalence relation. The relation R tells how to simply order the
equivalence classes under E. To be precise, let A* be the collection of equivalence
classes, and define R* on A* as follows: If α and β are equivalence classes,
pick a ∈ α and b ∈ β, and let αR*β hold if and only if aRb holds. Of
course, this process may lead to ambiguities, because whether we take
αR*β may depend on which a and b we chose. We shall show below that
this is not the case, that is, that R* is well-defined. Assuming this for now,
we can summarize the definition of R* as follows:
a*R*b* ⇔ aRb. (1.18)


Then (A*, R*) is called the reduction or quotient of (A, R).†

To give an example, suppose A = {0, 1, 2, ..., 26} and aRb holds iff
a mod 3 ≥ b mod 3. Then aEb holds iff a ≡ b (mod 3). There are three
equivalence classes: 0*, 1*, and 2*. We have 2*R*1*, since 2R1; and
similarly 2*R*0*, 2*R*2*, etc. We see that here the reduction (A*, R*) is a
simple order on the set A* of equivalence classes; it is like the usual simple
order ≥ on {0, 1, 2}. That this example is not a special case is
summarized in the following theorem. 


THEOREM 1.2. Suppose (A, R) is a weak order. Then the reduction (A*, R*)
is well-defined, and it is a simple order. 


Proof. To show that R* is well-defined, we need to show that its
definition does not depend on the particular choice of elements a ∈ α and
b ∈ β of the equivalence classes α and β. Put in other words, if we choose
c ∈ α and d ∈ β, then we need to show that aRb iff cRd. To show this,
suppose aRb. Since a and c are in α, we conclude that aEc. Similarly, bEd.
Now aEc implies aRc and cRa, and bEd implies bRd and dRb. Now, if
aRb, then using cRa, we conclude cRb by transitivity. From cRb and bRd
we conclude cRd, again by transitivity. The proof that cRd implies aRb is
analogous. Thus, R* is well-defined.

To prove that (A*, R*) is a simple order, one must verify that it is
transitive, strongly complete, and antisymmetric. To show that it is
transitive, suppose that α, β, and γ are in A* and that αR*β and βR*γ. To show αR*γ,
pick a in α and c in γ and show aRc. Now given b ∈ β, we have aRb, since
αR*β, and we have bRc, since βR*γ. Since (A, R) is transitive, aRc
follows. This implies αR*γ. The rest of the proof is left to the reader. ∎


COROLLARY The reduction is well-defined even if (A,R) is only a quasi 
order. 


Proof. The proof of well-definedness uses only transitivity of (A, R) and the
fact that E is an equivalence relation, both of which hold for a quasi order. ∎
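The reduction of Theorem 1.2 can be computed directly for the running example. The following Python sketch assumes a relation is a set of ordered pairs on a finite set; the names `reduction` and `is_simple_order` are ours. It builds E from Eq. (1.17), forms the classes, defines R* as in Eq. (1.18), and then checks both well-definedness and the simple-order axioms.

```python
from itertools import product

def reduction(A, R):
    """Return (A*, R*) for a quasi order R on A, following Eqs. (1.17) and (1.18)."""
    E = {(a, b) for (a, b) in R if (b, a) in R}             # aEb iff aRb & bRa
    cls = {a: frozenset(b for b in A if (a, b) in E) for a in A}
    A_star = set(cls.values())
    R_star = {(cls[a], cls[b]) for (a, b) in R}             # (a*, b*) is included whenever aRb
    return A_star, R_star

def is_simple_order(A, R):
    transitive = all((a, c) in R for (a, b) in R for c in A if (b, c) in R)
    strongly_complete = all((a, b) in R or (b, a) in R for a, b in product(A, repeat=2))
    antisymmetric = all(a == b for (a, b) in R if (b, a) in R)
    return transitive and strongly_complete and antisymmetric

A = range(27)
R = {(a, b) for a, b in product(A, repeat=2) if a % 3 >= b % 3}
A_star, R_star = reduction(A, R)

# Well-definedness: whether aRb holds never depends on the representatives chosen.
well_defined = all(
    ((a, b) in R) == ((alpha, beta) in R_star)
    for alpha in A_star for beta in A_star for a in alpha for b in beta
)

print(len(A_star), len(R_star), well_defined, is_simple_order(A_star, R_star))
```

Here (A, R) itself fails antisymmetry (for example 0R3 and 3R0), while the reduction on the three classes passes all three tests.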


If we accept the fact that every simple order on a finite set can be
realized as the ordering ≥ on a set of real numbers, then we may interpret
Theorem 1.2 as follows: Every weak order (A, R) on a finite set A may be
realized on the real line with equivalent elements in the same vertical
column and so that aRb holds if and only if a is to the right of b or equal
with b (cf. Fig. 1.3).


†(A*, R*) is sometimes denoted (A/E, R/E), and the process of reduction is sometimes
called “canceling out the equivalence relation.” This is similar to what we do in group theory
when we pass from a group to its cosets.

The relation > on the real numbers is not a simple order. It is not 
strongly complete, because it is not reflexive. But it is complete in the sense 
that for all a ≠ b, aRb or bRa. The relation (Re, >) has many properties,
among them transitivity, completeness, asymmetry, antisymmetry, and 
negative transitivity. We would like to characterize this ordering > just as 
we characterized the ordering ≥ by listing properties or axioms that
essentially determined it—namely, the axioms for a simple order. In 
axiom-building, one tries to be as frugal as possible, and list only those 
properties that are needed. Of the above list, some are superfluous. For 
asymmetry implies antisymmetry. Similarly, transitivity and completeness 
imply negative transitivity. (The proofs are left to the reader.) These 
observations lead us to adopt the following definition: A binary relation 
(A,R) is called a strict simple order if it is asymmetric, transitive, and 
complete. (It is left to the reader to show that none of the conditions in this 
definition is superfluous.) It is often assumed that strict preference is a 
strict simple order, though it probably violates at least completeness: We 
can be indifferent between two alternatives. The relation of strict
containment ⊊ is not a strict simple order, because it violates completeness. The
relation “beats” in a round-robin tournament usually does not determine a 
strict simple order, for transitivity is violated. Naturally, > on the set of 
reals is a strict simple order. We shall prove in Section 3.1 that in every 
strict simple order R on a finite set A, the elements of A may be laid out 
on the line with aRb holding if and only if a is strictly to the right of b. 
Thus, > is indeed the prototype of strict simple orders. It is also easy to 
show, and we shall ask the reader to show it, that the strict simple orders 
and simple orders are related to each other in the same way that > is 
related to ≥. Namely, suppose (A, R) is an irreflexive relation and S is
defined on A by


aSb ⇔ aRb or a = b.


Then (A, R) is a strict simple order if and only if (A,S) is a simple order. 
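A small check of this correspondence may be helpful. The Python sketch below uses illustrative helper names of ours and two test relations: > on {1, 2, 3, 4}, and a hypothetical three-player round-robin tournament in which 1 beats 2, 2 beats 3, and 3 beats 1. In both cases R is irreflexive, and R is strict simple exactly when S = R ∪ {(a, a)} is simple.

```python
from itertools import product

def is_transitive(A, R):
    return all((a, c) in R for (a, b) in R for c in A if (b, c) in R)

def is_strict_simple(A, R):
    asymmetric = all((b, a) not in R for (a, b) in R)
    complete = all((a, b) in R or (b, a) in R
                   for a, b in product(A, repeat=2) if a != b)
    return asymmetric and is_transitive(A, R) and complete

def is_simple(A, R):
    antisymmetric = all(a == b for (a, b) in R if (b, a) in R)
    strongly_complete = all((a, b) in R or (b, a) in R
                            for a, b in product(A, repeat=2))
    return antisymmetric and is_transitive(A, R) and strongly_complete

def add_equality(A, R):
    """Return S with aSb iff aRb or a = b."""
    return set(R) | {(a, a) for a in A}

A = [1, 2, 3, 4]
GT = {(a, b) for a, b in product(A, repeat=2) if a > b}

B = [1, 2, 3]
BEATS = {(1, 2), (2, 3), (3, 1)}   # hypothetical cyclic tournament: not transitive

print(is_strict_simple(A, GT), is_simple(A, add_equality(A, GT)))
print(is_strict_simple(B, BEATS), is_simple(B, add_equality(B, BEATS)))
```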

To complete our discussion of order relations, we would like to define a 
type of order relation, called strict weak, which corresponds to strict simple 
orders in the same way that weak orders correspond to simple orders. The 
paradigm example will again be an ordering of an array of vertical 
columns, but now an element a is in the relation R to an element b if and 
only if a is strictly to the right of b. An example can be defined as follows: 
Let A = {0, 1, 2, ..., 26}, and let aRb hold if and only if a mod 3 > b
mod 3. Such a relation will clearly be asymmetric and transitive, but no
longer complete: of two elements a and b in the same vertical column,
neither aRb nor bRa holds. Two elements a and b are in the same vertical
column if and only if ~aRb and ~bRa, and this suggests that we should
study the following binary relation:


aEb ⇔ ~aRb and ~bRa.* (1.19)


If (A, R) is strict simple, then (A, E) is equality. In general, we would like
(A, E) to be an equivalence relation. We could define (A, R) to be a strict 
weak order if it is asymmetric and transitive and (A, E) is an equivalence 
relation. (The reader might wish to check that our relation > (mod 3) on 
{0, 1, 2, ..., 26} satisfies these properties.) However, there is something 
unsatisfactory about this definition. Specifically, all our definitions of 
order relations (A, R) so far have been stated in terms of properties of 
(A, R), and not in terms of properties of relations like (A, E) which are 
defined from (A, R). Any definition of strict weak order should turn out to 
be equivalent to this potential definition, and that will be the case with the 
definition we adopt. We shall say that (A, R) is by definition a strict weak 
order if (A, R) is asymmetric and negatively transitive. To justify this 
definition, we prove the following theorem. 


THEOREM 1.3. (A, R) is a strict weak order if and only if 
(a) (A, R) is asymmetric, 
and 
(b) (A, R) is transitive, 
and 
(c) (A, E) is an equivalence relation, where E is defined by Eq. (1.19). 


Proof. Assume (A, R) is strict weak. Then it is asymmetric by definition. 
To show that it is transitive, suppose aRb and bRc. To show aRc, suppose 
by way of contradiction that ~aRc. By asymmetry, bRc implies ~cRb. 
Now by negative transitivity, ~aRc and ~cRb imply ~aRb, which is a 
contradiction. It is left to the reader to prove that if (A,R) is strict weak, 
then (A, E) is an equivalence relation.

To complete the proof of Theorem 1.3, let us assume that (A, R) satisfies 
conditions (a), (b), and (c). To show that (A,R) is strict weak, it is 
sufficient to show that (A,R) is negatively transitive. To demonstrate 
negative transitivity, we assume that ~aRb and ~bRc, and show ~aRc. 
We argue by cases. 


CASE 1: bRa. In this case, if aRc, we conclude bRc by transitivity. 
Thus, ~ aRc. 

CASE 2: ~ bRa. Here, there are two subcases. 

CASE 2a: cRb. In this case, if aRc, we conclude aRb by transitivity. 
Thus, ~ aRc. 


CASE 2b: ~cRb. Here, we have aEb and bEc, since ~aRb, ~bRa,
~bRc, and ~cRb. Since (A, E) is an equivalence relation, we conclude
aEc, from which ~aRc follows. ∎


*In Exer. 12 of Section 1.3, (A, E) is called the symmetric complement of (A, R).
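Theorem 1.3 lends itself to an exhaustive check on a small set. The Python sketch below, with function names of our own choosing, codes the definition of a strict weak order (asymmetric and negatively transitive) and the three conditions (a), (b), (c) of the theorem, and compares them on the relation > (mod 3), on a one-pair relation that fails both, and on all 512 binary relations on a three-element set.

```python
from itertools import product

def is_asymmetric(R):
    return all((b, a) not in R for (a, b) in R)

def is_transitive(A, R):
    return all((a, c) in R for (a, b) in R for c in A if (b, c) in R)

def is_negatively_transitive(A, R):
    # ~aRb and ~bRc imply ~aRc; equivalently, aRc implies aRb or bRc for every b.
    return all((a, b) in R or (b, c) in R for (a, c) in R for b in A)

def symmetric_complement(A, R):
    """Return E of Eq. (1.19): aEb iff ~aRb and ~bRa."""
    return {(a, b) for a, b in product(A, repeat=2)
            if (a, b) not in R and (b, a) not in R}

def is_equivalence(A, E):
    reflexive = all((a, a) in E for a in A)
    symmetric = all((b, a) in E for (a, b) in E)
    return reflexive and symmetric and is_transitive(A, E)

def strict_weak_by_definition(A, R):
    return is_asymmetric(R) and is_negatively_transitive(A, R)

def strict_weak_by_theorem(A, R):
    return (is_asymmetric(R) and is_transitive(A, R)
            and is_equivalence(A, symmetric_complement(A, R)))

A = range(27)
R = {(a, b) for a, b in product(A, repeat=2) if a % 3 > b % 3}

B = [1, 2, 3]
Q = {(1, 2)}   # asymmetric and transitive, but 1E3 and 3E2 hold without 1E2

# Compare the two characterizations on every binary relation on B.
pairs_B = list(product(B, repeat=2))
agree_on_all = all(
    strict_weak_by_definition(B, S) == strict_weak_by_theorem(B, S)
    for mask in range(2 ** len(pairs_B))
    for S in [{p for i, p in enumerate(pairs_B) if mask >> i & 1}]
)

print(strict_weak_by_definition(A, R), strict_weak_by_theorem(A, R))
print(strict_weak_by_definition(B, Q), strict_weak_by_theorem(B, Q))
print(agree_on_all)
```

A finite check of this kind is of course no substitute for the proof; it only confirms that the two characterizations agree on the cases tried.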


Even if indifference is allowed, it is often assumed that the relation of 
strict preference is an example of a strict weak order, though later we shall 
question this assumption. Another example is the binary relation “weighs 
more than” on a set of people, if weight is measured on a precise scale. A 
third example is the binary relation “warmer than” on a set of objects, if 
warmer than is based on temperature and temperature is measured on a 
precise scale. We shall return to these examples in Chapter 3. 

The process of reduction is the same for strict weak orders as it is for 
weak orders. If (A, R) is a strict weak order, let the equivalence relation E 
be defined by Eq. (1.19) and let A* be the collection of equivalence classes 
under E. Then define the reduction R* on A* as before, by


a*R*b* ⇔ aRb.


THEOREM 1.4. Suppose (A, R) is a strict weak order. Then the reduction
(A*, R*) is well-defined and it is a strict simple order.


Proof. The proof is left to the reader. ∎


Exercises 


1. (a) Is the converse R⁻¹ of a strict weak order R necessarily strict
weak? 
(b) Is the converse of a weak order necessarily weak? 
(c) Is the converse of a strict simple order necessarily strict simple? 
(d) Is the converse of a simple order necessarily simple? 
2. Is every quasi order a simple order? 
3. Is every strict weak order strict simple? 


4. Let A = {(a, b): a, b ∈ {1, 2, 3, 4}}, and suppose

(a, b)R(c, d) iff a > c.
Show that (A, R) is a strict weak order and calculate the reduction
(A*, R*).


5. Suppose A = {a, b, x, y, α, β, γ}, and R consists of the following
ordered pairs:


(a, a), (b, b), (x, x), (y, y), (α, α), (β, β), (γ, γ),
(y, x), (x, y), (x, a), (x, b), (y, a), (y, b), (α, β),
(α, γ), (β, α), (β, γ), (γ, α), (γ, β), (α, x), (α, y),
(α, a), (α, b), (β, x), (β, y), (β, a), (β, b), (γ, x),
(γ, y), (γ, a), (γ, b).


Note that (A, R) is a weak order and calculate the reduction.
6. Suppose A = Re × Re, and suppose P is defined on A by
(a, b)P(s, t) iff a ≥ s & b ≥ t & (a > s or b > t).


Show that (A, P) is not a strict weak order or a weak order. 
7. Suppose A = Re × Re, and suppose P is defined on A by


(a, b)P(s, t) iff a > s or (a = s and b > t).


(a) Show that (A, P) is a strict weak order; it is called the lexicographic
ordering of the plane.
(b) Is (A, P) strict simple? 
(c) Is (A, P) weak? 
8. Suppose (A, R) is the binary relation of Exer. 10, Section 1.3. 
(a) Is (A,R) a strict weak order? 
(b) A weak order? 


9. Suppose A is a set of students and aRb holds if and only if a has a 
higher grade point average than b, or a and b have the same grade point 
average and a has a smaller number of absences. 

(a) Is (A, R) strict weak? 
(b) If so, what is its reduction? 
10. Suppose (A, R) and (A, S) are weak orders. 
(a) Is (A, R ∩ S) necessarily weak?
(b) What about (A, R ∪ S)?
(c) What about (A, R ∘ S)?


11. Show that the axioms for a simple order are all needed by giving 
examples of binary relations that are 
(a) transitive, antisymmetric, and not strongly complete; 
(b) transitive, strongly complete, and not antisymmetric; 
(c) antisymmetric, strongly complete, and not transitive. 
12. Show that the axioms for a strict simple order are all needed by 
giving examples of binary relations that are 
(a) asymmetric, transitive, and not complete; 
(b) asymmetric, complete, and not transitive; 
(c) transitive, complete, and not asymmetric. 
13. Prove that transitivity and completeness imply negative transitivity. 
14. Prove that asymmetry implies antisymmetry.
15. If (A, R) is a simple order, prove that (A*, R*) is
(a) strongly complete; 
(b) antisymmetric. 
16. Show that if (A, R) is a strict weak order and E is defined on A by
Eq. (1.19), then (A, E) is an equivalence relation.
17. If (A, R) is a quasi order and E is defined on A by Eq. (1.17), show
that (A, E) is 
(a) reflexive; 
(b) transitive. 
18. Prove Theorem 1.4. 


19. Suppose (A, R) is irreflexive. Define S on A by 
aSb iff (aRb or a=b). 


Show that (A, R) is strict simple if and only if (A, S) is simple.
20. Suppose (A, R) is strict weak and E is defined on A by Eq. (1.19). 
Define S on A by 


aSb iff (aRb or aEb). 


Show that (A, S) is weak.
21. Suppose (A, R) is weak. Define S on A by 


aSb iff (aRb & ~bRa). 


Show that (A, S) is strict weak. 

22. (a) Does the data of Table 1.4 define a strict weak order? 
(b) A weak order? 
(c) A strict simple order? 
(d) A simple order? 

23. Repeat Exer. 22 for the data of Table 1.5. 

24. Repeat Exer. 22 for the data of Table 1.6. 

25. Repeat Exer. 22 for the data of Table 1.7. 

26. Repeat Exer. 22 for the data of Table 1.8. 

27. Repeat Exer. 22 for the data of Table 1.9. 


1.6 Partial Orders 


Very often an ordering relation violates completeness, as we have seen. 
For example, the relation >(mod3) does; the relation “father of” does; 
most probably, the relation “louder than” on a set of airplane engines 
does, since very possibly two different engines will sound equally loud; 
and the relation ⊆ of containment on a collection of sets does. The latter
relation has the properties of reflexivity and transitivity and also
antisymmetry: X ⊆ Y and Y ⊆ X imply X = Y. A binary relation satisfying these
three properties is called a partial order. Thus, a partial order is an 
antisymmetric quasi order. Each simple order is a partial order, but not 
conversely. The relations “father of” and “louder than” are not partial
orders, since they are not reflexive.

Partial orders often arise if we are stating preferences among alternatives
that have several aspects or dimensions. Thus, suppose A is a set of 
alternative designs for a transportation system. Suppose we judge these 
designs on the basis of two aspects: cost and number of people served. Let 
us suppose we (weakly) prefer design a to design b if and only if a costs no 
more than b and a serves at least as many people as b. Let us denote this 
weak preference relation as W. Then (A, W) is certainly not complete: if a 
costs less than b and serves fewer people, then ~aWb and ~bWa. On the 
other hand, (A, W) is transitive and, if we assume that no two designs both 
cost the same and serve the same number of people, it is also antisymmet- 
ric. Thus, (A, W) is a partial order. 

More generally, suppose A is a collection of alternatives each of which 
has n dimensions or aspects, and suppose the real number f_i(x) measures
the “worth” of x on the ith aspect. It is not unreasonable to define a
(weak) preference relation W on A by


aWb iff [f_i(a) ≥ f_i(b) for each i].


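The binary relation (A, W) defines a partial order, provided, as before, that
no two distinct alternatives have the same worth on every aspect (otherwise
antisymmetry fails and (A, W) is only a quasi order).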
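To make the construction concrete, here is a minimal Python sketch with invented data: three hypothetical designs d1, d2, d3, each scored by cost and by number of people served. The figures are ours, chosen only for illustration. Cost enters with a minus sign so that a larger value of each scale means greater worth.

```python
from itertools import product

# Hypothetical designs: (cost in millions, people served in thousands).
designs = {"d1": (100, 50), "d2": (120, 70), "d3": (90, 50)}

# Scales f_1, f_2 on which larger is better, so cost is negated.
scales = [lambda x: -designs[x][0], lambda x: designs[x][1]]

A = list(designs)
W = {(a, b) for a, b in product(A, repeat=2)
     if all(f(a) >= f(b) for f in scales)}   # aWb iff f_i(a) >= f_i(b) for each i

reflexive = all((a, a) in W for a in A)
transitive = all((a, c) in W for (a, b) in W for c in A if (b, c) in W)
antisymmetric = all(a == b for (a, b) in W if (b, a) in W)
strongly_complete = all((a, b) in W or (b, a) in W for a, b in product(A, repeat=2))

print(sorted(W))
print(reflexive, transitive, antisymmetric, strongly_complete)
```

With these figures d3 is weakly preferred to d1, while d2 is incomparable with both of the others, so W is a partial order that is not strongly complete.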

To give yet another example of a partial order, let A be a set of points in 
the plane, some of which are joined by straight lines, as in the diagram of 
Fig. 1.5. If a, b ∈ A, let aRb hold if and only if a = b or there is a
continually descending path from a to b, following lines of the diagram.
For example, in Fig. 1.5, we have 1R2, 1R3, 1R4, 1R5, 2R4, 3R5, and 
aRa for every a. It is fairly easy to prove that (A, R) is a partial order. 
(Proof is left to the reader.) What is not so obvious is that every partial 
order R on a finite set A arises from such a diagram, called a Hasse 
diagram of the partial order. To give an example, let us consider the partial 
order ⊆ on A = the set of subsets of {1, 2, 3}. A Hasse diagram
corresponding to (A, ⊆) is shown in Fig. 1.6. The reader will note that the
Hasse diagram of a simple order is a “chain,” as shown in Fig. 1.7. Usually 
it is convenient in Hasse diagrams to omit a line from point a to point b if 
there is a continuously descending path of more than one link from a to b. 
We shall follow this procedure in our diagrams. 
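The lines that remain in a Hasse diagram after this omission can be computed from R. The Python sketch below uses helper names of ours and the partial order on {1, 2, 3, 4, 5} described for Fig. 1.5. It keeps a pair (a, b) only when no third element lies strictly between a and b, and then confirms that the full relation is recovered from those lines.

```python
def covering_pairs(A, R):
    """Pairs (a, b) joined by a single line of the Hasse diagram, with a above b."""
    strict = {(a, b) for (a, b) in R if a != b}
    return {(a, b) for (a, b) in strict
            if not any((a, c) in strict and (c, b) in strict for c in A)}

def reflexive_transitive_closure(A, pairs):
    """Rebuild the partial order from the lines of its diagram."""
    closure = set(pairs) | {(a, a) for a in A}
    while True:
        new = {(a, d) for (a, b) in closure for (c, d) in closure if b == c}
        if new <= closure:
            return closure
        closure |= new

A = [1, 2, 3, 4, 5]
R = {(a, a) for a in A} | {(1, 2), (1, 3), (1, 4), (1, 5), (2, 4), (3, 5)}

lines = covering_pairs(A, R)
print(sorted(lines))
print(reflexive_transitive_closure(A, lines) == R)
```

The pairs (1, 4) and (1, 5) are dropped because they are accounted for by the two-link paths through 2 and 3, respectively.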


Figure 1.5. Hasse diagram of the partial order (A, R) defined by A = {1, 2, 3, 4, 5},
R = {(1, 1), (2, 2), (3, 3), (4, 4), (5, 5), (1, 2), (1, 3), (1, 4), (1, 5), (2, 4), (3, 5)}.


Figure 1.6. Hasse diagram of the partial order of inclusion ⊆ on the set of subsets of
{1, 2, 3}.


Figure 1.7. Hasse diagram of a simple order. 


Analogously to strict simple orders, we can speak of strict partial orders. 
A binary relation is a strict partial order if it is asymmetric and transitive. 
Partial orders and strict partial orders are related in the same way that 
simple orders and strict simple orders are. Namely, suppose (A, R) is an 
irreflexive binary relation and we define


aSb iff (aRb or a= b). 


Then (A, R) is a strict partial order if and only if (A, S) is a partial order.
Again, a strict partial order R corresponds to a Hasse diagram, except that 
now an element is not in the relation R to itself. 


Exercises 

1. Suppose A = {1, 2, 3, 4} and

R = {(1, 1), (2, 2), (3, 3), (4, 4), (1, 3), (1, 4), (2, 3), (2, 4), (3, 4)}.


(a) Show that (A, R) is a partial order.
(b) Draw the Hasse diagram.
2. Which of the following are strict partial orders?
(a) & on the collection of subsets of {1,2,3,4}. 
(b) (A, P) where A = Re × Re and


(a, b)P(s, t) iff (a > s and b > t).


(c) (A, Q) where A is a set of n-dimensional alternatives, f_1, f_2,
..., f_n are real-valued scales on A, and Q is defined by


aQb ⇔ [f_i(a) > f_i(b) for each i].
(d) The relation (A, Q) where A is as in part (c) and
aQb ⇔ [f_i(a) ≥ f_i(b) for each i and f_i(a) > f_i(b) for some i].


(e) (A,R) of Exer. 10, Section 1.3. 
3. (a) Is the converse R⁻¹ of a strict partial order necessarily a strict
partial order? 
(b) Is the converse of a partial order necessarily a partial order? 
4. (a) Is every weak order a partial order? 
(b) Is every partial order a weak order?


5. Is every quasi order a partial order? 

6. Show that there are quasi orders that are not partial orders, not 
weak orders, and not equivalence relations. 

7. Show that every strict weak order is a strict partial order. 

8. Draw the Hasse diagram of the general strict weak order. 

9. If (A, R) is strict weak, define S on A by 


aSb iff (aRb or a=b). 


Show that (A,S) is a partial order. 


10. Show that none of the axioms for a partial order is superfluous by 
giving examples of binary relations that are 
(a) reflexive, transitive, and not antisymmetric; 
(b) reflexive, antisymmetric, and not transitive; 
(c) transitive, antisymmetric, and not reflexive. 


11. Suppose (A, R) is irreflexive, and define S on A as in Exer. 9. Show 
that (A, R) is a strict partial order if and only if (A, S) is a partial order. 


12. Exercises 12 through 18 introduce the notion of dimension of a strict 
partial order. For further reference on this subject, see Baker, Fishburn, 
and Roberts [1971] or Trotter [1978]. Also, see Exer. 2 of Section 5.1 and 
Exers. 21 and 31 of Section 6.1. Suppose (A, P) is a strict partial order. A 
strict simple order R on A such that P ⊆ R is called a strict simple extension
of (A, P). For example, the strict partial order defined from the Hasse
diagram of Fig. 1.5 has as one strict simple extension the ordering in which
1 comes first, 2 second, 3 third, 4 fourth, and 5 fifth. List all the strict 
simple extensions of this strict partial order. 


13. Szpilrajn’s Extension Theorem [1930] states that if (A, P) is a strict
partial order, if a ≠ b, and if ~aPb and ~bPa, then there is a strict simple
extension (A, R) of (A, P) so that aRb. Show from this that there is a
family F of strict simple extensions so that


P = ∩{R : R ∈ F}.


14. Dushnik and Miller [1941] define the dimension of a strict partial 
order (A,P) as the smallest cardinal number m so that (A,P) is the 
intersection of m strict simple extensions. (By Exer. 13, dimension is 
well-defined.) As an example, the strict partial order (A, P) defined from 
Fig. 1.5 has dimension 2. Show this by observing that (A, P) is not strict 
simple and writing (A,P) as the intersection of two strict simple exten- 
sions. 

15. If A is the set of all subsets of {1, 2, 3} and P is the strict partial
order ⊊, show that (A, P) has dimension 3. (Komm [1948] proves that the
strict partial order ⊊ on the set of subsets of a set S has dimension |S|.)

16. Show that every strict weak order has dimension at most 2. 

17. Hiraguchi [1955] shows that if (A, P) is a strict partial order with |A| 
finite and at least 4, then (A, P) has dimension at most |A|/2. For a simple 
proof of this result, see Trotter [1975]. Show that dimension can be less 
than [|A|/2], where [a] is the greatest integer less than or equal to a. 

18. (a) Use Hiraguchi’s Theorem (Exer. 17) to obtain upper bounds for 
dimension of the strict partial orders whose Hasse diagrams are shown in 
Fig. 1.8. 

(b) Use Komm’s Theorem (Exer. 15) in one case and a specific 
construction in the other case to determine the exact dimensions. 


19. (a) Does the data of Table 1.4 define a partial order? 
(b) A strict partial order? 


Figure 1.8. Hasse diagrams of two strict partial orders, panels (a) and (b).
20. Repeat Exer. 19 for the data of Table 1.5.
21. Repeat Exer. 19 for the data of Table 1.6.
22. Repeat Exer. 19 for the data of Table 1.7.
23. Repeat Exer. 19 for the data of Table 1.8.
24. Repeat Exer. 19 for the data of Table 1.9.


1.7 Functions and Operations 


Suppose A is a set. A function f: A → A can be thought of as a binary
relation (A, R) with the following properties:


(∀a ∈ A)(∃b ∈ A)(aRb), (1.20)

(∀a, b, c ∈ A)(aRb & aRc ⇒ b = c). (1.21)

Conversely, any binary relation (A, R) satisfying (1.20) and (1.21) can be
thought of as a function from A into A. More generally, a function
f: Aⁿ → A can be thought of as an (n + 1)-ary relation (A, R) satisfying
properties analogous to (1.20) and (1.21). For example, if n = 2, these
properties are as follows:

(∀a, b ∈ A)(∃c ∈ A)[(a, b, c) ∈ R], (1.22)

(∀a, b, c, d ∈ A)[(a, b, c) ∈ R & (a, b, d) ∈ R ⇒ c = d]. (1.23)

Such functions from A × A into A are sometimes called binary operations,
or just operations for short. Abstractly, a ternary relation defines a binary
operation if and only if it satisfies Eqs. (1.22) and (1.23).

Let us give some examples. Consider the operation + of addition of real
numbers. Given a pair of real numbers a and b, + assigns a third real
number c so that c = a + b. The corresponding ternary relation ⊕ on Re is
defined as follows:


(a, b, c) ∈ ⊕ iff c = a + b.

Thus, (1, 2, 3) ∈ ⊕ and (1, 3, 4) ∈ ⊕, but (2, 5, 3) ∉ ⊕. The operation
× of multiplication corresponds to the ternary relation ⊗ on Re defined
as follows:

(a, b, c) ∈ ⊗ iff c = a × b.

To give yet another example, suppose A = Re and


∘(a, b, c) ⇔ c = a/b.


Then the ternary relation (A, ∘) is not an operation because there is no c
so that ∘(1, 0, c). Next, suppose A = Re and

∘(a, b, c) ⇔ c = √(ab).


Then (A, ∘) is not an operation because there is no c so that ∘(2, −2, c). If
we again take A = Re and now define


∘(a, b, c) ⇔ c = ±√|ab|,


then (A, ∘) is still not an operation, for ∘(2, 2, 2) and ∘(2, 2, −2). However,
(A, ∘) is an operation if we only allow the positive square root. Another
operation ∘ on A = Re can be defined by taking


∘(a, b, c) ⇔ c = a + 2b.


If (A, ∘) is an operation and ∘(a, b, c) holds, we usually write c = a ∘ b.
Thus, in our present example, 5 = 1 ∘ 2 and 8 = 2 ∘ 3.
To give two further examples, suppose A = {1, 2, 3} and we define


R = {(1, 1, 1), (1, 2, 2), (1, 3, 1), (2, 1, 2), (2, 2, 1),
(2, 3, 2), (3, 1, 1), (3, 2, 2), (3, 3, 1)}


and


S = {(1, 1, 1), (1, 2, 1), (1, 3, 1), (2, 1, 2), (2, 2, 2),
(2, 3, 2), (3, 1, 3), (3, 2, 3), (1, 3, 2)}.


Then (A, R) is a binary operation: it is the operation that assigns to a and
b the number 1 if a + b is even and the number 2 if a + b is odd. (A, S) is
not a binary operation, since both S(1, 3, 1) and S(1, 3, 2) hold.
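Conditions (1.22) and (1.23) are easy to test on a finite set. The sketch below is a minimal illustration in Python, assuming a ternary relation is stored as a set of triples. The function name is_binary_operation and the variable parity_op are hypothetical names introduced here, not notation from the text.

```python
from itertools import product

def is_binary_operation(A, T):
    """Return True iff the ternary relation T on A satisfies (1.22) and (1.23)."""
    for a, b in product(A, repeat=2):
        results = {c for (x, y, c) in T if (x, y) == (a, b)}
        # (1.22) demands at least one c in A; (1.23) demands at most one.
        if len(results) != 1 or not results <= set(A):
            return False
    return True

A = {1, 2, 3}
R = {(1, 1, 1), (1, 2, 2), (1, 3, 1), (2, 1, 2), (2, 2, 1),
     (2, 3, 2), (3, 1, 1), (3, 2, 2), (3, 3, 1)}
S = {(1, 1, 1), (1, 2, 1), (1, 3, 1), (2, 1, 2), (2, 2, 2),
     (2, 3, 2), (3, 1, 3), (3, 2, 3), (1, 3, 2)}

# The parity rule described in the text: 1 if a + b is even, 2 if a + b is odd.
parity_op = {(a, b, 1 if (a + b) % 2 == 0 else 2) for a, b in product(A, repeat=2)}

print(is_binary_operation(A, R))  # R passes both conditions
print(is_binary_operation(A, S))  # S fails: two values are assigned to the pair (1, 3)
print(R == parity_op)             # R coincides with the parity rule
```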

To give still another example, which we shall encounter later, suppose A
is the set of aircraft engines discussed above, among which we are
interested in comparing subjective loudness. If a and b are two engines, let
us think of a ∘ b as the object consisting of both engines placed next to
each other. As far as loudness is concerned, the loudness of a ∘ b is
compared to other loudnesses by running both engines at once. We might
call ∘ an operation of “combination” and speak of comparing combined
loudness. Unfortunately, ∘ does not allow us to define an operation on the
set A in the precise sense we have defined. We would think of defining the
ternary relation ∘(a, b, c) which holds if and only if c is a ∘ b. Unfortunately,
if a and b are in A, a ∘ b is not necessarily in A, and so there is
no c for which ∘(a, b, c) holds, violating condition (1.22). Moreover, the
combination a ∘ a does not make sense, and yet an operation must be
defined on all pairs in A × A including pairs (a, a).

(To rectify this situation, we can think in terms of a hypothetical set B
consisting of infinitely many copies of each element a of A, and of a set C
consisting of finite sets of elements from B. Two sets in C are
considered equivalent if for each element a in A the two sets have the same
number of copies of a. Then equivalence is indeed an equivalence
relation. Let C* be the collection of equivalence classes. If α and β are in
C*, pick disjoint representatives x in α and y in β and define α ∘ β to be
the equivalence class of x ∪ y. It is not hard to show that ∘ is well-defined
and defines an operation on C*.)


Exercises 


1. Suppose A is the set of positive integers.
(a) Show that R(a, b, c) iff a + b = c defines an operation on A. 
(b) Show that S(a, b, c) iff a — b = c does not define an operation 
on A. 
(c) Show that T(a, b, c) iff a + b + c = 0 does not define an opera- 
tion on A. 


2. Suppose A is all of the integers. 
(a) Is the relation (A, R) of Exer. 1 an operation? 
(b) What about the relation (A, S) of Exer. 1? 
(c) What about the relation (A, T) of Exer. 1? 
3. Which of the following relations (A, U) define operations? 
(a) A = Re, U(a, b, c) iff a + b + c = 0.
(b) A = Re, U(a, b, c) iff abc = 0.
(c) A = Re, U(a, b, c) iff a > b > c.
4. If A = {0, 1, 2} and


R = {(0, 0, 0), (0, 1, 0), (0, 2, 0), (1, 0, 0), (1, 1, 1), 
(1, 2, 2), (2, 0, 0), (2, 1, 2), (2, 2, 1)}, 


show that (A, R) is an operation. (What operation is it?) 


5. (a) If A = {Los Angeles, Chicago, New York}, show that the follow- 
ing relation on A is not a binary operation: 


R = {(Los Angeles, Chicago), (Los Angeles, New York), 
(Chicago, New York) }. 


(b) Which of the following relations on A is a binary operation? 
(i) S given by the following set of triples: 
(Los Angeles, Los Angeles, Chicago) 
(Los Angeles, Chicago, Chicago) 
(Los Angeles, New York, Chicago) 
(Chicago, Chicago, Chicago) 
(Chicago, New York, Chicago) 
(New York, New York, Chicago). 
(ii) T = {(x, y, New York): x, y ∈ A} ∪
{(Chicago, Chicago, Chicago)}.
6. (a) Show that the following binary relations (A, R) are functions:
(i) A = Re, R = {(a, a + 1): a ∈ A}.
(ii) A = Re, aRb iff b = a².
(iii) A = Re, aRb iff a = b.
(b) Show that the following binary relations are not functions:
(i) A = Re, aRb iff a = b².
(ii) A = Re, aRb iff a > b.
(c) Which of the following binary relations are functions?
(i) A = Re, aRb iff 6a + 2b = 0.
(ii) A = Re, aRb iff a > b + 1.
(iii) A = Re, aRb iff a divides b.
7. Which of the following quaternary relations (A, R) correspond to
functions from A × A × A to A?
(a) A = {SO₂, DDT}
R = {(SO₂, SO₂, SO₂, NO₂), (NO₂, NO₂, NO₂, SO₂)}.
(b) A = Re, R = {(a, b, c, d): d = √(abc)}.
(c) A = Re, R = {(a, b, c, d): d > a + b + c}.
8. Suppose R is a function on A and S is a function on A.
(a) Is the relative product R ∘ S necessarily a function on A?
(b) How does the notion of relative product compare to the usual 
notion of composition of two functions? 


1.8 Relational Systems and the Notion of Reduction 


Suppose R_1, R_2, ..., R_p are (not necessarily binary) relations on the
same set A and ∘_1, ∘_2, ..., ∘_q are binary operations on A. We shall call the
(p + q + 1)-tuple 𝔄 = (A, R_1, R_2, ..., R_p, ∘_1, ∘_2, ..., ∘_q) a relational
system. Of course, we could treat binary operations as relations, and so simply
speak of a relational system as a (p + 1)-tuple (A, R_1, R_2, ..., R_p). However,
in what follows, it will be convenient to single out the binary
operations. On the other hand, we do not single out any other functions.

It will be useful to generalize the reduction or quotient procedure of
Section 1.5 to relational systems 𝔄. The procedure is called by Scott and
Suppes [1958] the method of cosets. Let us start with a relational system
(A, R_1, R_2, ..., R_p) having no operations. We define a relation of
equivalence E on A by saying that aEb holds if and only if a and b are “perfect
substitutes” for each other with respect to all the relations R_i. (Formally, a
and b are perfect substitutes for each other with respect to an m-ary relation
R_i if the following condition holds: given sequences (a_1, a_2, ..., a_m) and
(b_1, b_2, ..., b_m) from A, if for j = 1, 2, ..., m, a_j ≠ b_j implies {a_j, b_j} =
{a, b}, then


R_i(a_1, a_2, ..., a_m) iff R_i(b_1, b_2, ..., b_m).


This definition, and the notion of reduction defined below, appear in Scott
and Suppes [1958].) It is not hard to show that if a* is the equivalence class
containing a, then the following relations R_i* are well-defined on the
collection A* of equivalence classes:


R_i*(a_1*, a_2*, ..., a_m*) iff R_i(a_1, a_2, ..., a_m).


The relational system 𝔄* = (A*, R_1*, R_2*, ..., R_p*) is called the reduction
or quotient of 𝔄, and is often denoted 𝔄/E. The reader should verify
that if (A, R) is a strict weak order, then the perfect substitutes relation E
is the same as the tying relation E defined in Eq. (1.19) of Section 1.5.

Handling the general case, suppose we are given a relational system
𝔄 = (A, R_1, R_2, ..., R_p, ∘_1, ∘_2, ..., ∘_q), and aEb holds for a, b in A if
and only if a and b are perfect substitutes for each other with respect to all
the relations R_i. Let us define binary operations ∘_i* on A* as follows:


a* ∘_i* b* = (a ∘_i b)*.

The operation ∘_i* is well-defined provided the following condition holds:

(aEa' & bEb') ⇒ (a ∘_i b)E(a' ∘_i b'). (1.24)


If (1.24) holds for all i, we say the relational system 𝔄 is shrinkable, and we
call the relational system 𝔄* = (A*, R_1*, R_2*, ..., R_p*, ∘_1*, ∘_2*, ..., ∘_q*)
the reduction of 𝔄. We say that a relational system 𝔄 is irreducible if every
equivalence class with respect to E has exactly one element.

We illustrate these ideas with an example. Let A = {0, 1, 2, ..., 26},
and define R and ∘ on A by


aRb iff a mod 3 > b mod 3, (1.25)

c = a ∘ b iff [c ≡ a + b (mod 3) and c ∈ {0, 1, 2}]. (1.26)

Thus, 2R1, 8R10, etc. Similarly, 2 ∘ 2 = 1, 6 ∘ 3 = 0. There are three
equivalence classes under E, namely:

0* = {0, 3, 6, 9, 12, 15, 18, 21, 24},
1* = {1, 4, 7, 10, 13, 16, 19, 22, 25},
and
2* = {2, 5, 8, 11, 14, 17, 20, 23, 26}.

We have 2*R*1*, since 2R1. Equation (1.24) holds, and hence ∘* is
well-defined, and 𝔄 = (A, R, ∘) is shrinkable. We have, for example,
2* ∘* 2* = 1*, since 2 ∘ 2 = 1. We shall return to the idea of reduction in
Chapter 2.
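The reduction in this example can be reproduced by brute force. The Python sketch below is one possible way to do so, under stated assumptions. The names perfect_substitutes, classes, class_of, and shrinkable are hypothetical, and the perfect-substitutes test is written only for the single binary relation R of Eq. (1.25), not for general m-ary relations.

```python
from itertools import product

A = list(range(27))

def R(a, b):
    # Eq. (1.25)
    return a % 3 > b % 3

def op(a, b):
    # Eq. (1.26): the element of {0, 1, 2} congruent to a + b modulo 3
    return (a + b) % 3

def perfect_substitutes(a, b):
    """Brute-force test of the perfect-substitutes condition for the binary R."""
    swap = {a: b, b: a}
    for s in product(A, repeat=2):
        # t may differ from s only where an entry a is exchanged for b, or b for a
        choices = [(x, swap[x]) if x in swap else (x,) for x in s]
        if any(R(*s) != R(*t) for t in product(*choices)):
            return False
    return True

# Group the elements of A into equivalence classes under E.
classes = []
for a in A:
    for cls in classes:
        if perfect_substitutes(a, cls[0]):
            cls.append(a)
            break
    else:
        classes.append([a])

class_of = {a: i for i, cls in enumerate(classes) for a in cls}

# Condition (1.24): the class of a o b depends only on the classes of a and b.
shrinkable = all(
    class_of[op(a, b)] == class_of[op(a2, b2)]
    for cls_a in classes for cls_b in classes
    for a, a2 in product(cls_a, repeat=2)
    for b, b2 in product(cls_b, repeat=2)
)

print(classes)
print(shrinkable)
```

The three lists printed are the classes 0*, 1*, and 2* given above, and the final check confirms that condition (1.24) holds for this system.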
Exercises

1. If 𝔄 = (A, E) is an equivalence relation, what is the reduction 𝔄*?

2. Suppose A = Re × Re,


(a, b)R(c, d) ⇔ a > c, (1.27)

and

(a, b)S(c, d) ⇔ a = c. (1.28)


(a) Identify the reduction (A*, R*).
(b) Identify the reduction (A*, S*).
(c) Identify the reduction (A*, R*, S*).

3. Suppose A = Re × Re, R is defined on A by Eq. (1.27), and


(a, b)S(c, d) ⇔ b > d.


Identify the reduction (A*, R*, S*).

4. Suppose A = Re, R = >, and ∘ = +. Show that (A, R, ∘) is shrinkable.

5. Are the following relational systems shrinkable? If so, find their
reductions.
(a) (A, R, ∘), where A = Re⁺, R = >, ∘ = ×.
(b) (A, R, ∘), where A = Re × Re, R is defined by Eq. (1.27), and


(a, b) ∘ (c, d) = (a + c, b + d).


(c) (A, R, ∘, ∘'), where A, R, and ∘ are as in (b) and

(a, b) ∘' (c, d) = (ac, bd).


(d) (A, R, S, ∘), where A, R, and ∘ are as in (b) and S is defined by
Eq. (1.28).

(e) (A, R, S, ∘, ∘'), where A, R, S, and ∘ are as in (d) and ∘' is as in
(c).
(f) (A, R, ∘), where A = {0, 1, 2, ..., 26}, R = >, and ∘ is defined
by Eq. (1.26).

6. Show that if (A, R) is a strict weak order, then the perfect substitutes
relation E is the same as the tying relation of Eq. (1.19) of Section 1.5.


7. Show that the reduction 𝔄* is always irreducible.
CHAPTER 2


Fundamental Measurement, 
Derived Measurement, and the 
Uniqueness Problem 


2.1 The Theory of Fundamental Measurement 
2.1.1 Formalization of Measurement 


In this chapter, we introduce the theory of fundamental measurement 
and the theory of derived measurement, and study the uniqueness of 
fundamental and derived measures. Fundamental measurement deals with 
the measurement process that takes place at an early stage of scientific 
development, when some fundamental measures are first defined. Derived 
measurement takes place later, when new measures are defined in terms of 
others previously developed. In this section, we shall begin with fundamen- 
tal measurement. Derived measurement will be treated in Section 2.5. Our 
approach to measurement follows those of Scott and Suppes [1958], Suppes 
and Zinnes [1963], Pfanzagl [1968], and Krantz et al. [1971]. 

Russell [1938, p. 176] defines measurement as follows: “Measurement of 
magnitudes is, in its most general sense, any method by which a unique 
and reciprocal correspondence is established between all or some of the 
magnitudes of a kind and all or some of the numbers, integral, rational, or 
real as the case may be.” Campbell [1938, p. 126] says that measurement is 
“the assignment of numerals to represent properties of material systems 
other than number, in virtue of the laws governing these properties.” To 
Stevens [1951, p. 22], “measurement is the assignment of numerals to 
objects or events according to rules.” Torgerson [1958, p. 14] says that 
“measurement of a property ... involves the assignment of numbers to 
systems to represent that property.” 
What then is measurement? It seems almost redundant to say that
measurement has something to do with assignment of numbers. (In Section 
6.1.4, however, we shall argue that measurement without numbers is a 
perfectly legitimate and useful activity.) All the above definitions suggest 
that measurement has something to do with assigning numbers that corre- 
spond to or represent or “preserve” certain observed relations. This idea 
fits paradigm examples of physics such as temperature or mass. In the case 
of temperature, measurement is the assignment of numbers that preserve 
the observed relation “warmer than.” In the case of mass, the relation 
preserved is the relation “heavier than.” 

More precisely, suppose A is a set of objects and the binary relation
aWb holds if and only if you judge a to be warmer than b. Then we want
to assign a real number f(a) to each a ∈ A such that for all a, b ∈ A,


aWb ⇔ f(a) > f(b). (2.1)


Similarly, if A is a set of objects which you lift and H is the judged relation
“a is heavier than b,” then we would like to assign a real number f(a) to
each a ∈ A such that for all a, b ∈ A,


aHb ⇔ f(a) > f(b). (2.2)


Measurement in the social sciences can be looked at in a similar manner.
Thus, for example, measurement of preference is assignment of numbers
preserving the observed binary relation “preferred to.” If A is a set of
alternatives and aPb holds if and only if you (strictly) prefer* a to b, then
we would like to assign a real number u(a) to each a ∈ A such that for all
a, b ∈ A,


aPb ⇔ u(a) > u(b). (2.3)


The function u is often called a utility function or an ordinal utility function 
or an order-preserving utility function, and the value u(a) is called the utility 
of a. Measurement of loudness is analogous, and calls for an assignment of 
numbers to sounds preserving the observed relation “louder than.” So is 
measurement of air quality: we are trying to preserve an observed relation 
like “the air quality on day a was better than the air quality on day b.”

In the case of mass, we actually demand more of our “measure.” We
want it to be “additive” in the sense that the mass of the combination of
two objects is the sum of their masses. Formally, we need to speak of a
binary operation ∘ on the set A of objects; think of a ∘ b as the object


*Strict preference is to be distinguished from weak preference: the former means “better 
than,” the latter “at least as good as.” See Section 1.5. 
obtained by placing a next to b.† We want a real-valued function f on A
that not only satisfies condition (2.2) but also “preserves” the binary
operation ∘, in the sense that for all a, b ∈ A,


f(a ∘ b) = f(a) + f(b). (2.4)


There is no comparable operation in the case of temperature as it is
commonly measured.‡ Whether there is a comparable operation in the
case of preference depends on the structure of the set of alternatives being
considered, and on how demanding we want to be in our measurement.
We might want to allow complex alternatives like paper and pencil (a ∘ b),
and we might want to require utility to be additive, that is, to satisfy


u(a ∘ b) = u(a) + u(b). (2.5)


A utility function that is also additive is often called a cardinal utility
function.§


2.1.2 Homomorphisms of Relational Systems 


Abstracting from these examples, let us use the concept of a relational
system introduced in Section 1.8. A relational system is an ordered
(p + q + 1)-tuple 𝔄 = (A, R_1, R_2, ..., R_p, ∘_1, ∘_2, ..., ∘_q), where A is a
set, R_1, R_2, ..., R_p are (not necessarily binary) relations on A, and
∘_1, ∘_2, ..., ∘_q are (binary) operations on A. The type of the relational
system is a sequence (r_1, r_2, ..., r_p; q) of length p + 1, where r_i is m if R_i
is an m-ary relation. For example, in the case of mass, we are dealing with
a relational system (A, H, ∘) of type (2; 1). In the case of temperature, we
are dealing with a relational system (A, W) of type (2; 0). The relational
system 𝔄 = (Re, >, ≥, +) has type (2, 2; 1). 𝔄 is an example of what we
shall call a numerical relational system, that is, one where A is the set of real
numbers. A second example of a numerical relational system is
(Re, >, +, ×), which has type (2; 2). Although we have been very general
in our definition of relational systems, we shall usually deal with only a
small number of relations and operations.

In our examples, we have seen that in measurement we start with an 
observed or empirical relational system 𝔄 and we seek a mapping to a
numerical relational system 𝔅 which “preserves” all the relations and


†The formal difficulties are the same as those we encountered in combining
aircraft engines in Section 1.7.

‡The exception is if we are dealing with the Kelvin notion of temperature. Later on, we
shall want to make other demands in the measurement of temperature.

§In the literature, a utility function is often called cardinal if it gives rise to a scale at least
as strong as what we shall call an interval scale.
operations in 𝔄. For example, in measurement of mass, we seek a mapping
from 𝔄 = (A, H, ∘) to 𝔅 = (Re, >, +) which “preserves” the relation H
and the operation ∘. In measurement of temperature, we seek a mapping
from 𝔄 = (A, W) to 𝔅 = (Re, >) which “preserves” the relation W. A
mapping f from one relational system 𝔄 to another 𝔅 which preserves all
the relations and operations is called a homomorphism. To make this
precise, suppose 𝔅 = (B, R_1', R_2', ..., R_p', ∘_1', ∘_2', ..., ∘_q') is a second
relational system of the same type as 𝔄. A function f: A → B is called a
homomorphism from 𝔄 into 𝔅 if, for all a_1, a_2, ..., a_{r_i} ∈ A,


R_i(a_1, a_2, ..., a_{r_i}) ⇔ R_i'[f(a_1), f(a_2), ..., f(a_{r_i})], i = 1, 2, ..., p, (2.6)

and for all a, b ∈ A,

f(a ∘_i b) = f(a) ∘_i' f(b), i = 1, 2, ..., q. (2.7)


(The function f need not be one-to-one or onto.) For example, in the case
of mass, a function f that satisfies Eqs. (2.2) and (2.4) gives a homomorphism
from the observed relational system (A, H, ∘) into the numerical
relational system (Re, >, +). A one-to-one homomorphism will be called
an isomorphism. If there is a homomorphism from 𝔄 into 𝔅, we say 𝔄 is
homomorphic to 𝔅. If there is an onto isomorphism, we say 𝔄 is isomorphic
to 𝔅.

It is important to make one remark about the definition of homomorphism.
If ∘_i and ∘_i' were considered ternary relations on A and B,
respectively, then the condition corresponding to Eq. (2.6) would read


(a, b, c) ∈ ∘_i ⇔ [f(a), f(b), f(c)] ∈ ∘_i'. (2.8)


Equation (2.7) is the implication ⇒ of Eq. (2.8), but it does not correspond
exactly to (2.8). We shall show this by example below.

To give a concrete example of a homomorphism, suppose A =
{0, 1, 2, ..., 26}, and we define R on A by


aRb ⇔ a mod 3 > b mod 3.


(The reader will recall that a mod 3 is the number among 0, 1, 2 that is
congruent to a, modulo 3.) Then the function f(a) = a mod 3 defines a
homomorphism from 𝔄 = (A, R) into 𝔅 = (Re, >). We have f(11) =
2, f(6) = 0, etc. To give a second example, suppose A = {a, b, c, d} and


R = {(a, b), (b, c), (a, c), (a, d), (b, d), (c, d)}.


Then a homomorphism from 𝔄 = (A, R) into 𝔅 = (Re, >) is given by
f(a) = 4, f(b) = 3, f(c) = 2, f(d) = 1. A second homomorphism is given by
g(a) = 10, g(b) = 4, g(c) = 2, g(d) = 0. Both f and g are isomorphisms.
Next, suppose A = {a, b, c} and


R = {(a, b), (b, c), (c, a)}.


Then there is no homomorphism f from (A, R) into (Re, >), since aRb
implies f(a) > f(b), bRc implies f(b) > f(c), and cRa implies f(c) > f(a),
which together give the contradiction f(a) > f(a).
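These three examples can be verified directly from condition (2.6). The Python sketch below is a hedged illustration. The function names is_order_homomorphism and exists_order_homomorphism are assumptions introduced here. The existence search relies on the observation that, for a finite set A, only the relative order of the values f(a) matters, so it is enough to try values in {0, 1, ..., |A| − 1}.

```python
from itertools import product

def is_order_homomorphism(A, R, f):
    """Condition (2.6) for one binary relation: aRb iff f(a) > f(b)."""
    return all(((a, b) in R) == (f[a] > f[b]) for a, b in product(A, repeat=2))

def exists_order_homomorphism(A, R):
    """Exhaustive search over assignments A -> {0, ..., |A| - 1} (finite A only)."""
    A = list(A)
    return any(
        is_order_homomorphism(A, R, dict(zip(A, values)))
        for values in product(range(len(A)), repeat=len(A))
    )

# First example: A = {0, ..., 26}, aRb iff a mod 3 > b mod 3, f(a) = a mod 3.
A1 = list(range(27))
R1 = {(a, b) for a, b in product(A1, repeat=2) if a % 3 > b % 3}
f1 = {a: a % 3 for a in A1}

# Second example: the four-element relation with the maps f and g of the text.
A2 = ["a", "b", "c", "d"]
R2 = {("a", "b"), ("b", "c"), ("a", "c"), ("a", "d"), ("b", "d"), ("c", "d")}
f2 = {"a": 4, "b": 3, "c": 2, "d": 1}
g2 = {"a": 10, "b": 4, "c": 2, "d": 0}

# Third example: the cyclic relation, for which no homomorphism exists.
A3 = ["a", "b", "c"]
R3 = {("a", "b"), ("b", "c"), ("c", "a")}

print(is_order_homomorphism(A1, R1, f1))
print(is_order_homomorphism(A2, R2, f2), is_order_homomorphism(A2, R2, g2))
print(exists_order_homomorphism(A2, R2), exists_order_homomorphism(A3, R3))
```

The exhaustive search is practical only for very small sets, since it examines |A| to the power |A| assignments.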

To give several additional examples, if A = Re and B = Re, then f(a) =
−a gives a homomorphism from (A, >) into (B, <). If A =
{0, 1, 2, ...} and B = {0, 2, 4, ...}, then f(a) = 2a defines a homomorphism
from 𝔄 = (A, >, +) to 𝔅 = (B, >, +). A second homomorphism is
given by g(a) = 4a. (There is no requirement that a homomorphism must
be an onto function.) If 𝔄 = (Re, >, +) and 𝔅 = (Re⁺, >, ×), where
Re⁺ is the positive reals, then f(a) = e^a defines a homomorphism from 𝔄
into 𝔅. For


a > b ⇔ e^a > e^b


and


e^(a+b) = e^a × e^b.

To give still another example, suppose 


A = {a,b, c,d, e}, R = {(b, c), (d, a)} 
and 


B = {plane, train, car, bus, bicycle}, S = {(car, bus), (plane, bicycle)}. 


Then the function f:A — B defined by f(a) = bicycle, f(b) = car, f(c) = 
bus, f(d) = plane, f(e) = train is a homomorphism, indeed an isomor- 
phism, from (A, R) into (B, S). 

To give one final example, let A = {x, y} and let ∘ be the operation
defined as follows:


x ∘ x = x, x ∘ y = x, y ∘ x = x, y ∘ y = y. (2.9)


Then the function f defined by f(x) = f(y) = 0 is a homomorphism from
(A, ∘) into (Re, +), that is, it satisfies Eq. (2.7). However, f does not satisfy
Eq. (2.8), since f(y) = f(x) + f(x), yet y ≠ x ∘ x. Similarly, if R is the
empty relation on A, then f is a homomorphism from (A, R, ∘) into
(Re, >, +), even though f violates (2.8).
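The gap between Eqs. (2.7) and (2.8) in this example can be seen by enumerating all cases. The following Python sketch is illustrative, assuming the operation of Eq. (2.9) is stored as a dictionary keyed by pairs. The variable names are hypothetical.

```python
from itertools import product

A = ["x", "y"]
# The operation of Eq. (2.9), stored as a table.
op = {("x", "x"): "x", ("x", "y"): "x", ("y", "x"): "x", ("y", "y"): "y"}
f = {"x": 0, "y": 0}

# Eq. (2.7): f(a o b) = f(a) + f(b) for all a, b.
satisfies_2_7 = all(f[op[a, b]] == f[a] + f[b] for a, b in product(A, repeat=2))

# Eq. (2.8): (a, b, c) is in o  iff  f(a) + f(b) = f(c).
violations_of_2_8 = [
    (a, b, c) for a, b, c in product(A, repeat=3)
    if (op[a, b] == c) != (f[a] + f[b] == f[c])
]

# With R the empty relation, (2.6) requires that f(a) > f(b) never holds.
R = set()
satisfies_2_6 = all(((a, b) in R) == (f[a] > f[b]) for a, b in product(A, repeat=2))

print(satisfies_2_7, satisfies_2_6)
print(violations_of_2_8)
```

The list of violations includes the triple (x, x, y) used in the text: f(x) + f(x) = f(y), although x ∘ x is not y.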
2.1.3 The Representation and Uniqueness Problems


In general, we shall say that fundamental measurement has been performed
if we can assign a homomorphism from an observed (empirical)
relational system 𝔄 to some (usually specified) numerical relational system
𝔅. Thus, measurement of temperature is the assignment of a homomorphism
from the observed relational system (A, W) to the numerical relational
system (Re, >), measurement of mass is the assignment of a
homomorphism from the observed relational system (A, H, ∘) to the
numerical relational system (Re, >, +), and so on.

The difficult philosophical question—not a mathematical question—is 
the specification of the numerical relational system 𝔅. Why try to find a
homomorphism into one relational system rather than another? The 
answer will depend on a combination of intuition and theory about what is 
being measured, and desired properties of the numerical assignment. If we 
have a homomorphism, the homomorphism is said to give a representation, 
and the triple (𝔄, 𝔅, f) will be called a scale, though sometimes we shall be
sloppy and refer to f alone as the scale. The reader may wish to formulate 
in these terms other measurement problems, for example, measurement of 
length, of area, or of height.* 

The first basic problem of measurement theory is the representation
problem: Given a particular numerical relational system 𝔅, find conditions
on an observed relational system 𝔄 (necessary and) sufficient for the
existence of a homomorphism from 𝔄 into 𝔅. The emphasis is on finding
sufficient conditions. If all the conditions in a collection of sufficient
conditions are necessary as well, that is all the better. A more important
criterion is that the conditions be “testable” or empirically verifiable in
some sense. (Often it is desired that the conditions be statable in the form
of a law expressible using only universal conditionals.) In any case, the
conditions are usually called axioms for the representation, and the theorem
stating their sufficiency is usually called a representation theorem. If
possible, the proof of a representation theorem should be constructive; it
should not only show us that a representation is possible, but it should
show us how actually to construct it. A typical axiom for a representation
theorem is the following. Suppose we seek a homomorphism f from (A, R)
into (Re, >). If such a homomorphism f exists, it follows that (A, R) is
transitive. For if aRb and bRc, then f(a) > f(b) and f(b) > f(c), from which
f(a) > f(c) and aRc follow. Thus, a typical measurement axiom is the
requirement that (A, R) be transitive.


*We have chosen to allow all possible real numbers as scale values. In practical measure- 
ment, of course, only rational numbers are ever needed. However, it is convenient theoreti- 
cally to allow all possible real values. Later on, we shall modify our position, and consider 
scales that assign mathematical objects other than numbers. 
In Sections 3.1, 3.2, and 3.3, we shall present some basic representation
theorems. Further representation theorems will be proved in other 
chapters. 

The axioms for a representation give conditions under which measure- 
ment can be performed. The axioms can be thought of as giving a 
foundation on which the process of measurement is based. In a less global 
sense, the axioms can also be thought of as conditions that must be 
satisfied in order for us to organize data in a certain way. In any case, it is 
important to be able to state such foundational axioms, at least for 
measurement in the social sciences. For we must know under what 
circumstances certain kinds of scales of measurement can be produced. In 
the physical sciences, the situation is different. We by now have well- 
developed scales of measurement, and writing down a representation 
theorem for these scales is often more a theoretical exercise than a 
significant practical development. 

The second basic problem of measurement theory is the uniqueness 
problem: How unique is the homomorphism f? We shall see later in this
chapter that a uniqueness theorem tells us what kind of scale f is, and gives
rise to a theory of meaningfulness of statements involving scales. In 
particular, a uniqueness theorem puts limitations on the mathematical 
manipulations that can be performed on the numbers arising as scale 
values. As Hays [1973, p. 87] points out, one can always perform mathe- 
matical operations on numbers (add them, average them, take logarithms, 
etc.). However, the key question is whether, after having performed such 
operations, one can still deduce true (or better, meaningful) statements 
about the objects being measured. 

Sometimes we shall start with a desired uniqueness result and work 
backwards. That is, we shall seek a representation that will lead to that 
result. For example, in measurement of temperature, we shall start with the 
observation that temperature is measured up to determination of an origin 
and of a unit, and ask what representations will give rise to a scale of 
temperature that has these properties. 

This chapter will emphasize the uniqueness problem and applications of 
the theory of uniqueness. In Chapter 3 we shall begin our study of the 
representation problem. 

It should be pointed out in closing this subsection that not all of the 
theory of measurement, even that in the spirit of this work, fits perfectly 
into the framework we have described. One sometimes deals with systems 
formally different from the relational systems we have defined—for exam- 
ple, systems having several underlying sets (see Section 5.6), or having 
operations defined only on subsets of the underlying set (see, for example, 
Krantz et al. [1971, Section 3.4]), or having sets with structure such as
Boolean algebras and vector spaces (see, for example, Narens [1974a] and
Domotor [1969]). One sometimes modifies the notion of homomorphism,
for example, by using Eq. (2.8) in place of Eq. (2.7), or by modifying
(2.6) to read

R_i(a_1, a_2, ..., a_{r_i}) ⇒ R_i'[f(a_1), f(a_2), ..., f(a_{r_i})].


(See, for example, Adams [1965].) One sometimes deals with representa- 
tions into relational systems where the underlying set is not the set of real 
numbers. This can involve as simple a change as using the rationals rather 
than the reals, or it can be as complex as using the nonstandard reals or 
systems with non-Archimedean properties. (See, for example, Skala [1975] 
or Narens [1974a,b].) It can also involve maps of objects into sets of 
numbers such as intervals, or into geometric figures such as rectangles and 
circles (see Section 6.1.4). Not as much work has been done in any of the 
directions mentioned as in the specific framework we have outlined for 
measurement. However, it will probably be fruitful to develop many of 
these alternative directions in the future. In all these directions, the central 
idea is unchanged. This is the notion of representation, the translation of 
“qualitative” concepts such as relations and operations into appropriate 
“quantitative” (or other concrete) relations and operations enjoying known 
properties. As we pointed out in the Introduction, it is from a representa- 
tion that we can learn about empirical phenomena, by applying the 
concepts and theories that have been developed for the representing 
relations to the represented ones. 


Exercises 


1. (a) Show in each of the following that the relational system 𝔄 is 
homomorphic to the relational system 𝔅 (2N is the set of even positive 
integers): 


             𝔄               𝔅 

    (i)      (N, >)          (2N, >) 
    (ii)     (2N, ≥)         (N, ≥) 
    (iii)    (N, >, +)       (2N, >, +) 
    (iv)     (Re, +)         (Re⁺, ×) 


(b) Determine which of the following relational systems 𝔄 is 
homomorphic to the corresponding relational system 𝔅: 


             𝔄               𝔅 

    (i)      (N, =)          (2N, =) 
    (ii)     (Re, +)         (Re, −) 
    (iii)    (Z, >)          (Z, <) 
    (iv)     (N, >)          (Z, <) 
    (v)      (Re, =)         (Re, ≠) 
    (vi)     (Re⁺, >, ×)     (Re, >, +) 
2. Show that there is no homomorphism from (N, >) into (N, <). 
3. Recall that a homomorphism f is an isomorphism if it is one-to-one. 
(a) Show that there is an isomorphism from (N, >) into (2N, >). 
(b) Show that if A = {1, 2, 3} and R is {(1, 2), (1, 3)}, then (A, R) is 
homomorphic to (Re, >), but not isomorphic to (Re, >). 

4. Suppose A = B = Re, D(a, b,c, d) holds on A iff a+b >c+d, 
and K(a, b,c,d) holds on B iff a+b<c+d. Show that (A, D) is 
homomorphic to (B, K). 

5. Suppose A = {Tom, Dick, Harry, John}, and 


R = {(Tom, Dick, John), 
(Tom, Dick, Harry), (Tom, John, Harry), (Dick, John, Harry) }. 


Let B(u, v, w) hold on Re if and only if u <v <w. Is (A, R) homomor- 
phic to (Re, B)? 


6. (a) Suppose A = {0, 1, 2, ..., 26} and R on A is defined by 

    aRb iff a mod 3 > b mod 3. 

(For the definition of a mod 3, see Section 1.5.) Suppose 

    c = a ∘ b iff [c ≡ a + b (mod 3) and c ∈ {0, 1, 2}]. 

Let f(a) = a mod 3. Show that f is a homomorphism from (A, R, ∘) into 
(N, >, ∘). 
(b) Is f a homomorphism from (A, R, ∘) into (Re, >, +)? 
7. Formalize in terms of relational systems the theory of measurement 
of length, area, or height. 


2.2 Regular Scales 
2.2.1 Definition of Regularity 


In this section, we begin with four statements and consider which seem 
to make sense. In the remainder of this chapter, we shall develop a theory 
of scale type and meaningfulness that accounts for our observations, and 
apply the theory to a wide variety of more complex problems. The theory 
is aimed at telling us what manipulations of scale values are appropriate in 
the sense that they lead to results that have unambiguous interpretations as 
statements about the objects or phenomena being investigated. 

The four statements we shall consider are the following: 


Statement I: The number of cans of corn in the local supermarket at 
closing time yesterday was at least 10. 
Statement II: One can of corn weighs at least 10. 
Statement III: One can of corn weighs twice as much as a second. 
Statement IV: The temperature of one can of corn at closing time 
yesterday was twice as much as that of a second can. 


Statement I seems to make sense but Statement II does not, for the 
number of cans is specified without reference to a particular scale of 
measurement, whereas the weight of a can is not. Similarly, Statement III 
seems to make sense but Statement IV does not, for the ratio of weights is 
the same regardless of the scale of measurement used (if one can has twice 
as many grams as another, it has twice as many ounces, pounds, kilograms, 
etc.); whereas the ratio of temperatures is not necessarily the same (if one 
can is twice as many degrees Fahrenheit as another, it is not twice as many 
degrees centigrade). To be an adequate description of what we mean by 
measurement, a theory must account for observations such as these. 

In general, to account for such observations, we shall study the unique- 
ness of the numerical assignment (homomorphism) involved. It is quite 
possible, given two relational systems 𝔄 and 𝔅 of the same type, for there 
to be several different functions that map 𝔄 homomorphically into 𝔅. 
Since this is the case, any statement about measurement should either 
specify which scale (which homomorphism) is being used or be true 
independent of scale. 

To make this statement more precise, let us recall that a scale is a triple 
(𝔄, 𝔅, f) where 𝔄 and 𝔅 are relational systems and f is a homomorphism 
from 𝔄 into 𝔅. Let us call this scale a numerical scale if 𝔅 is a numerical 
relational system, that is, the set underlying 𝔅 is Re. We are interested in 
the uniqueness of the numerical assignment f. Specifically, we shall say 
that a statement involving numerical scales is meaningful if its truth (or 
falsity) remains unchanged if every scale (𝔄, 𝔅, f) involved is replaced by 
another (acceptable) scale (𝔄, 𝔅, g). Meaningful statements are unambiguous 
in their interpretation. Moreover, they say something significant about 
the fundamental relations among the objects being measured, whereas 
statements that are dependent on a particular, arbitrary choice of scale do 
not. 

Meaningfulness can be studied by analyzing admissible transformations 
of scale. Suppose f is one homomorphism from a relational system 𝔄 into a 
relational system 𝔅, and suppose A is the set underlying 𝔄 and B is the set 
underlying 𝔅. Suppose φ is a function that maps the range of f, the set 


    f(A) = {f(a): a ∈ A}, 


into the set B. Then the composition φ ∘ f is a function from A into B. If 
φ ∘ f is a homomorphism from 𝔄 into 𝔅, we call φ an admissible transformation 
of scale. For example, suppose 𝔄 = (N, >), 𝔅 = (Re, >), and f: N → Re 
is given by f(x) = 2x. Then f is a homomorphism from 𝔄 into 𝔅. If 
φ(x) = x + 5, then φ ∘ f is also a homomorphism from 𝔄 into 𝔅, for we 
have (φ ∘ f)(x) = 2x + 5, and 


    x > y iff 2x + 5 > 2y + 5. 


Thus, φ: f(A) → B is an admissible transformation of scale. However, if 
φ(x) = −x for all x ∈ f(A), then φ is not an admissible transformation, 
for φ ∘ f is not a homomorphism from 𝔄 into 𝔅. 
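
The definition of an admissible transformation can be checked mechanically on a finite sample. The following Python sketch is only an illustration: it replaces N by the finite set {1, ..., 20} (an assumption, since a program cannot enumerate all of N), and the helper names `is_homomorphism` and `compose` are hypothetical, not notation from the text.

```python
from itertools import product

def is_homomorphism(h, domain, rel_a, rel_b):
    """True if a rel_a b holds exactly when h(a) rel_b h(b), for all a, b in domain."""
    return all(rel_a(a, b) == rel_b(h(a), h(b)) for a, b in product(domain, repeat=2))

def compose(phi, f):
    """Return the composition phi o f."""
    return lambda x: phi(f(x))

gt = lambda x, y: x > y          # the relation > on N and on Re
sample = range(1, 21)            # finite stand-in for N (assumption)
f = lambda x: 2 * x              # the scale f(x) = 2x from the text

shift = lambda x: x + 5          # phi(x) = x + 5
negate = lambda x: -x            # phi(x) = -x

f_ok = is_homomorphism(f, sample, gt, gt)
shift_ok = is_homomorphism(compose(shift, f), sample, gt, gt)
negate_ok = is_homomorphism(compose(negate, f), sample, gt, gt)
print(f_ok, shift_ok, negate_ok)
```

On this sample, f and the shifted scale pass the test while the negated scale fails, in agreement with the text. A finite check of this kind can refute admissibility but cannot by itself prove it on all of N.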

If (𝔄, 𝔅, f) is any scale and (𝔄, 𝔅, g) is any other scale, then it is 
sometimes possible to find a function φ: f(A) → B so that g = φ ∘ f. For 
example, if 𝔄, 𝔅, and f are as above and g(x) = 7x, then φ(x) = (7/2)x will 
suffice. If for every scale (𝔄, 𝔅, g), there is a transformation φ: f(A) → B 
such that g = φ ∘ f, then we shall call the scale (𝔄, 𝔅, f) regular. If every 
homomorphism f from 𝔄 into 𝔅 is regular, we shall call the representation 
𝔄 → 𝔅 regular. A representation 𝔄 → 𝔅 is regular if, given any two scales f 
and g, we can map each into the other by an admissible transformation. 
Almost all the representations we shall encounter in this volume are 
regular. For the regular representation, there will be a very nice theory of 
uniqueness, of meaningfulness, and of scale type. In particular, if every 
scale in a statement is regular, we may use the following simpler definition 
of meaningfulness: A statement involving (numerical) scales is meaningful 
if and only if its truth or falsity is unchanged under admissible 
transformations of all the scales in question. If not every scale is regular, we may 
salvage this definition of meaningfulness by generalizing our notion of 
admissible transformation. See Roberts and Franke [1976]. This modified 
definition of meaningfulness is originally due to Suppes [1959] and Suppes 
and Zinnes [1963]. For variants of it, see Robinson [1963], Adams, Fagot, 
and Robinson [1965], Pfanzagl [1968], and Luce [1978]. We shall test the 
definition against the examples given at the beginning of this section and 
then apply it to other, more complex examples. 

Before doing this, however, we present two examples of irregular scales. 
Define a binary relation >_1 on Re by 


    x >_1 y iff x > y + 1.                                          (2.10) 

Let 

    A = {r, s, t} and R = {(r, s), (r, t)}.                         (2.11) 

Let 

    f(r) = 2, f(s) = 0, f(t) = 0, g(r) = 2, g(s) = 0.1, g(t) = 0.   (2.12) 


Then f and g are homomorphisms from 𝔄 = (A, R) into 𝔅 = (Re, >_1), as 
is easy to verify. However, there is no function φ: f(A) → Re such that 
g = φ ∘ f. For (φ ∘ f)(s) = (φ ∘ f)(t), while g(s) ≠ g(t). (Homomorphisms 
into (Re, >_1) will play a crucial role in Section 6.1 in our study of utility 
functions if indifference is not transitive.) 
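
The first irregular example can be verified directly. The Python sketch below encodes Eqs. (2.10) through (2.12); the helper `factor_through` is a hypothetical name introduced here. It tries to build a function φ on f(A) with g = φ ∘ f and returns None when two elements share an f-value but have different g-values.

```python
from itertools import product

A = ["r", "s", "t"]
R = {("r", "s"), ("r", "t")}                 # Eq. (2.11)
f = {"r": 2, "s": 0, "t": 0}                 # Eq. (2.12)
g = {"r": 2, "s": 0.1, "t": 0}

def gt1(x, y):
    """The relation of Eq. (2.10): x is related to y iff x > y + 1."""
    return x > y + 1

def is_homomorphism(h):
    """True if (a, b) is in R exactly when h(a) is related to h(b) by gt1."""
    return all(((a, b) in R) == gt1(h[a], h[b]) for a, b in product(A, repeat=2))

def factor_through(f, g):
    """Return phi on f(A) with g = phi o f as a dict, or None if no such function exists."""
    phi = {}
    for a in A:
        if phi.setdefault(f[a], g[a]) != g[a]:
            return None                      # two elements share an f-value but differ under g
    return phi

f_is_hom, g_is_hom = is_homomorphism(f), is_homomorphism(g)
phi_f_to_g = factor_through(f, g)            # None: g is not phi o f for any phi
phi_g_to_f = factor_through(g, f)            # a dict: f = phi o g for this phi
```

Note the asymmetry: g is one-to-one, so f can be written as φ ∘ g, but g cannot be written as φ ∘ f.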

To give a second example of an irregular scale, define a binary relation 
M on Re by 


    xMy iff [x = y − 1 or y = x − 1 or x = y].                      (2.13) 

Let 

    A = {r, s} and R = {(r, r), (s, s), (r, s), (s, r)}.            (2.14) 

Let 

    f(r) = 0, f(s) = 0, g(r) = 0, g(s) = 1.                         (2.15) 


Then f and g are homomorphisms from 𝔄 = (A, R) into 𝔅 = (Re, M), as 
the reader can readily verify. However, there is no function φ: f(A) → Re 
such that g = φ ∘ f. For (φ ∘ f)(r) = (φ ∘ f)(s), while g(r) ≠ g(s). 

These two examples suggest the following characterization of regular 
scales. See Exer. 10 for a related result. 


THEOREM 2.1 (Roberts and Franke [1976]). (𝔄, 𝔅, f) is regular if and only 
if for every other homomorphism g from 𝔄 into 𝔅, and for all a, b in A, 


    f(a) = f(b) implies g(a) = g(b). 


Proof. If f is regular, then for each such g there is a function 
φ: f(A) → B such that g = φ ∘ f. Thus, f(a) = f(b) implies 
g(a) = φ[f(a)] = φ[f(b)] = g(b). Conversely, given g, define 
φ[f(a)] = g(a). Then, since f(a) = f(b) implies g(a) = g(b), φ is 
well-defined. Moreover, g = φ ∘ f, so f is regular. ∎ 


COROLLARY. Every isomorphism is regular. 
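
The criterion of Theorem 2.1 lends itself to a finite test. The Python sketch below applies it to the second example, Eqs. (2.13) through (2.15). Since the homomorphisms into Re cannot all be listed, candidate values are restricted to {0, 1, 2}; this restriction and the function name `is_regular` are assumptions of the sketch. A failure on the restricted list shows irregularity, whereas a pass is conclusive only when backed by an argument such as the Corollary.

```python
from itertools import product

A = ["r", "s"]
R = {("r", "r"), ("s", "s"), ("r", "s"), ("s", "r")}   # Eq. (2.14)

def M(x, y):
    """The relation of Eq. (2.13)."""
    return x == y - 1 or y == x - 1 or x == y

def is_homomorphism(h):
    """True if (a, b) is in R exactly when h(a) M h(b)."""
    return all(((a, b) in R) == M(h[a], h[b]) for a, b in product(A, repeat=2))

values = [0, 1, 2]   # finite stand-in for Re (assumption)
homs = [h for h in (dict(zip(A, v)) for v in product(values, repeat=len(A)))
        if is_homomorphism(h)]

def is_regular(f, homs):
    """Criterion of Theorem 2.1, tested only against the candidate list homs."""
    return all(g[a] == g[b]
               for g in homs
               for a, b in product(A, repeat=2)
               if f[a] == f[b])

f = {"r": 0, "s": 0}   # Eq. (2.15)
g = {"r": 0, "s": 1}
f_regular = is_regular(f, homs)   # fails: g separates r and s, f does not
g_regular = is_regular(g, homs)   # passes: g is one-to-one
```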
2.2.2 Reduction and Regularity 


The Corollary to Theorem 2.1 provides us with a means for avoiding the 
difficulty posed by irregular scales. The idea is to use the process of 
reduction described in Section 1.8 to guarantee that all homomorphisms 
are isomorphisms. We briefly sketch the idea. 

Given a relational system 𝔄 = (A, R_1, R_2, ..., R_p, ∘_1, ∘_2, ..., ∘_q), we 
define a binary relation E, the “perfect substitutes” relation, on A. Then if 


    (aEa′ & bEb′) ⇒ (a ∘_i b)E(a′ ∘_i b′), all i,                   (2.16) 

we say 𝔄 is shrinkable and define a reduction 𝔄* of 𝔄. 𝔄* is a relational 
system (A*, R_1*, R_2*, ..., R_p*, ∘_1*, ∘_2*, ..., ∘_q*) with relations R_i* and 
operations ∘_i* defined on the set A* of equivalence classes under E. See 
Section 1.8 for details. 

It is interesting to observe that if 𝔄 is irreducible (that is, if every 
equivalence class with respect to E has exactly one element), then it is 
shrinkable and 𝔄 is isomorphic to 𝔄* using the function f(a) = a*. 
Moreover, it is interesting to note that if f is a homomorphism from a 
relational system 𝔄 into a relational system 𝔅, then f(a) = f(b) implies 
aEb. Thus, if 𝔄 is irreducible, then every homomorphism f from 𝔄 into 𝔅 
is an isomorphism, for f(a) = f(b) implies aEb, which implies a = b. 
Finally, suppose 𝔄 is homomorphic to 𝔅 via a homomorphism f, and 𝔄 is 
shrinkable. Then we can find a homomorphism F from 𝔄* to 𝔅 by letting 
F(a*) be f(a) for some representative a of a*. 

These ideas provide us with a way of avoiding irregular scales. If 𝔄 is 
irreducible, we know that every homomorphism from 𝔄 to a relational 
system 𝔅 is an isomorphism, and so must be regular by the Corollary to 
Theorem 2.1. If 𝔄 is not irreducible but is shrinkable, and 𝔄 is 
homomorphic to 𝔅, then 𝔄* is homomorphic to 𝔅. Since 𝔄* is always irreducible 
(Exer. 7, Section 1.8), every homomorphism from 𝔄* to 𝔅 is an 
isomorphism, and so the representation 𝔄* → 𝔅 is regular. Thus, to guarantee 
that all scales in question are regular, it is sufficient first to reduce the 
relational systems in question by canceling out the perfect substitutes 
relation E. 

Let us illustrate these ideas by taking A = {0, 1, 2, ..., 26} and defining 
R and ∘ on A by 


    aRb iff a mod 3 > b mod 3, 


    c = a ∘ b iff [c ≡ a + b (mod 3) and c ∈ {0, 1, 2}], 


where a mod 3 is defined as that number in {0, 1, 2} which is congruent to 
a modulo 3. Then, as we pointed out in Section 1.8, 𝔄 = (A, R, ∘) is 
shrinkable. If ∘′ is addition mod 3 on {0, 1, 2}, then it is easy to see that 𝔄 
is homomorphic to 


    𝔅 = ({0, 1, 2}, >, ∘′) 


by the homomorphism f(a) = a mod 3. Moreover, 𝔄* is isomorphic to 𝔅 
by the function F(0*) = f(0) = 0, F(1*) = f(1) = 1, F(2*) = f(2) = 2. 
The scale (𝔄*, 𝔅, F) is regular. 
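
The mod 3 illustration is small enough to check exhaustively. The Python sketch below verifies that f(a) = a mod 3 preserves both R and the operation, then groups A by f-value. Treating those groups as the equivalence classes 0*, 1*, 2* is an assumption justified only for this example, where f(a) = f(b) holds exactly when a and b are congruent modulo 3; the general definition of E is in Section 1.8 and is not reproduced here.

```python
from itertools import product

A = range(27)                                  # A = {0, 1, ..., 26}
f = lambda a: a % 3                            # f(a) = a mod 3

R = lambda a, b: a % 3 > b % 3                 # a R b iff a mod 3 > b mod 3
op = lambda a, b: (a + b) % 3                  # a o b, always in {0, 1, 2}
op_prime = lambda x, y: (x + y) % 3            # o' : addition mod 3 on {0, 1, 2}

# f preserves the relation and the operation.
preserves_R = all(R(a, b) == (f(a) > f(b)) for a, b in product(A, repeat=2))
preserves_op = all(f(op(a, b)) == op_prime(f(a), f(b)) for a, b in product(A, repeat=2))

# Group A by f-value; in this example the groups are the residue classes mod 3.
classes = {}
for a in A:
    classes.setdefault(f(a), []).append(a)

# F sends each class to the common f-value of its members, so F is one-to-one.
F = {tuple(members): value for value, members in classes.items()}
```

Because F is one-to-one on the three classes, the Corollary to Theorem 2.1 applies to it, which is the point of passing to the reduction.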
Exercises 


1. (a) Suppose A = N and B = Q. Then (A, >) is homomorphic to 
(B, >) by the homomorphism f(a) = 2a. 
(i) Show that the function φ(x) = 4x + 3 is an admissible 
transformation of scale. 
(ii) Show that the function φ(x) = 2^x is also admissible. 
(iii) Show that the function φ(x) = −x + 10 is not admissible. 
(b) Suppose A = N, B = Q, and f(a) = −a is a homomorphism 
from (A, >) to (B, <). Which of the following transformations are 
admissible? 
(i) φ(x) = 4x + 3. 
(ii) φ(x) = 2^x. 
(iii) φ(x) = −x + 10. 
2. The function f(x) = e^x is a homomorphism from (Re, +) to 
(Re, ×). Which of the following φ are admissible transformations of scale? 


(a) φ(x) = 2x. 
(b) φ(x) = x + 5. 
(c) φ(x) = 2^x. 


3. Suppose A, R, ∘, and f are defined as in Exer. 6, Section 2.1, and f is 
considered a homomorphism from (A, R, ∘) into (N, >, ∘). 
(a) Show that one admissible transformation of f is the function φ 
defined by φ(0) = 0, φ(1) = 1, φ(2) = 5. 
(b) Which of the following transformations are admissible? 
(i) φ(x) = x + 2. 
(ii) φ(x) = 2x. 
(iii) φ(x) = 3x. 
(iv) φ(x) = x². 
(c) If f is instead considered a homomorphism from (A, R, ∘) into 
(B, >, ∘), where B = {0, 1, 2}, what are all admissible transformations φ? 
4. (a) Suppose A = {x, y, z} and R = {(x, y), (z, y)}. Let S be defined 
on Re by 


    uSv ⇔ u > v + 2. 


Show that the following functions are homomorphisms from (A, R) into 
(Re, S) and hence that ((A, R), (Re, S), f) is irregular: 


    f(x) = 9, f(y) = 0, f(z) = 9, 
    g(x) = 9, g(y) = 0, g(z) = 10. 


(b) Identify the reduction (A*, R*) and find a regular scale 
((A*, R*), (Re, S), F). 


5. (a) Define S as in Exer. 4. Let A = {x, y, z, w} and 


    T = {(x, z), (x, w), (y, z), (y, w)}. 

Show that (A, T) is homomorphic to (Re, S) and that there is a 
homomorphism f which defines an irregular scale. 
(b) Identify the reduction (A*, T*) and find a homomorphism from 
(A*, T*) into (Re, S). 
6. Define A on Re by 


    xAy iff |x − y| < 1. 


Let 

    A = {r, s, t}, R = {(r, r), (s, s), (t, t), (r, s), (s, r)}, 


and let 


    f(r) = 0, f(s) = 0, f(t) = 10, g(r) = 0, g(s) = 1/2, g(t) = 10. 


Conclude that ((A, R), (Re, A), f) is irregular. 


7. Suppose ∘ is defined on Re by x ∘ y = min{x, y}. Let f: Re → Re 
be defined by 


           −1  if x < 0, 
    f(x) =  0  if x = 0, 
            1  if x > 0. 


(a) Show that f is a homomorphism from 𝔄 = (Re, ∘) into 𝔅 = 
(Re, ∘). 

(b) Show that (𝔄, 𝔅, f) is an irregular scale. 

(c) Is 𝔄 shrinkable? If so, find 𝔄* and a homomorphism F from 𝔄* 
into 𝔅. 


8. Suppose A = {x, y} and ∘ is defined as in Eq. (2.9). Show the 
following: 
(a) f(x) = 0, f(y) = 1 defines a homomorphism from 𝔄 = (A, ∘) 
into 𝔅 = (Re, ×). 
(b) (𝔄, 𝔅, f) is regular. 
(c) However, the representation 𝔄 → 𝔅 is irregular. 


9. Given a system 𝔄 = (A, R_1, R_2, ..., R_p, ∘_1, ∘_2, ..., ∘_q), suppose we 
define a binary relation E′ on A by aE′b if and only if a and b are perfect 
substitutes for each other with respect to all the relations R_i and with 
respect to all the binary operations ∘_i considered as ternary relations. Show 
that if f is a homomorphism from 𝔄 to 𝔅, then f(a) = f(b) does not imply 
that aE′b. [The example of Eq. (2.9) illustrates this point.] 


10. (Roberts and Franke [1976]) Let E be the perfect substitutes 
relation. Prove that a representation 𝔄 → 𝔅 is regular if and only if, for every 
homomorphism f from 𝔄 into 𝔅, aEb implies f(a) = f(b). 
2.3 Scale Type 


If a representation 𝔄 → 𝔅 is regular (that is, if all scales (𝔄, 𝔅, f) are 
regular), then the class of admissible transformations defines how unique 
each such scale is, and can be used to define scale type. We shall assume 
throughout this section (unless mentioned otherwise) that all scales in 
question come from regular representations. The idea of defining scale 
type from the class of admissible transformations is due to S. S. Stevens 
[1946, 1951, 1959]. 

Table 2.1 gives several examples of scale types. The simplest example of 
a scale is where the only admissible transformation is φ(x) = x. There is 
only one way to measure things in this situation. Such a scale is called 
absolute. Counting is an example of an absolute scale. 


Table 2.1. Some Common Scale Types* 

Admissible Transformations        Scale Type   Example 

φ(x) = x (identity)               Absolute     Counting 

φ(x) = αx, α > 0                  Ratio        Mass 
Similarity transformation                      Temperature on the Kelvin scale 
                                               Time (intervals) 
                                               Loudness (sones)† 
                                               Brightness (brils)† 

φ(x) = αx + β, α > 0              Interval     Temperature (Fahrenheit, 
Positive linear transformation                   centigrade, etc.) 
                                               Time (calendar) 
                                               Intelligence tests, 
                                                 “standard scores”? 

x ≥ y iff φ(x) ≥ φ(y)             Ordinal      Preference? 
(Strictly) monotone increasing                 Hardness 
transformation                                 Air quality 
                                               Grades of leather, lumber, 
                                                 wool, etc. 
                                               Intelligence tests, raw scores 

Any one-to-one φ                  Nominal      Number uniforms 
                                               Label alternative plans 
                                               Curricular codes 

*See Table 2.2 for some other scale types. 
†According to the work of S. S. Stevens; see Chapter 4. 


To give a second example, let us suppose the admissible transformations 
are all the functions φ: f(A) → B of the form φ(x) = αx, α > 0. Such a 
function φ is called a similarity transformation, and a scale with the 
similarity transformations as its class of admissible transformations is 
called a ratio scale. Mass defines a ratio scale, as we can fix a zero point 
and then change the unit of mass by multiplying by a positive constant. 
Thus, for example, we change from kilograms to grams by multiplying by 
1000. The term ratio scale arose because ratios of quantities on a ratio 
scale—for example, mass—make sense. Temperature also defines a ratio 
scale if we allow absolute zero, as in the Kelvin scale. Intervals of time (in 
minutes, hours, etc.) define a ratio scale. According to Stevens (see 
Chapter 4), various sensations such as loudness and brightness can also be 
measured in ratio scales. 

To give a third example, suppose we let the class of admissible 
transformations be all functions φ: f(A) → B of the form φ(x) = αx + β, α > 0. 
Such a function is called a positive linear transformation, and a 
corresponding scale is called an interval scale. Temperature (as it is commonly 
measured) is an example of an interval scale. We vary the 0 point (this 
amounts to changing β) and also the unit (this amounts to changing α). In 
this way, we can change, for example, from Fahrenheit to centigrade. (We 
take α = 5/9 and β = −160/9.) Time on the calendar (for example, the 
year 1980) defines an interval scale. It is often argued that the “standard 
scores” from an intelligence test define an interval scale (Stevens [1959, p. 
25]). 
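
The contrast between Statements III and IV of Section 2.2.1 can be made concrete with a short numerical sketch in Python. The weights and temperatures below are hypothetical values chosen for illustration; the conversion constants are the usual ones (1/1000 from grams to kilograms, and α = 5/9, β = −160/9 from Fahrenheit to centigrade).

```python
def to_kilograms(grams):
    """Similarity transformation phi(x) = alpha * x with alpha = 1/1000."""
    return grams / 1000

def to_centigrade(fahrenheit):
    """Positive linear transformation phi(x) = alpha * x + beta, alpha = 5/9, beta = -160/9."""
    return (5 / 9) * fahrenheit - 160 / 9

# Statement III: one can weighs twice as much as a second (hypothetical weights).
w1, w2 = 800.0, 400.0                      # grams
ratio_grams = w1 / w2
ratio_kilograms = to_kilograms(w1) / to_kilograms(w2)

# Statement IV: one can is "twice as warm" as a second (hypothetical temperatures).
t1, t2 = 100.0, 50.0                       # degrees Fahrenheit
ratio_fahrenheit = t1 / t2
ratio_centigrade = to_centigrade(t1) / to_centigrade(t2)

# Ratios of differences do survive a positive linear transformation.
t3 = 68.0
diff_ratio_f = (t1 - t2) / (t2 - t3)
diff_ratio_c = (to_centigrade(t1) - to_centigrade(t2)) / (to_centigrade(t2) - to_centigrade(t3))
```

The weight ratio is the same in both units, while the temperature ratio changes with the scale. Ratios of temperature differences are unchanged, which is the sense in which an interval scale still carries quantitative information.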

Some scales are unique only up to order. For example, the scale of air 
quality being used in a number of cities is such a scale. It assigns a number 
1 to unhealthy air, 2 to unsatisfactory air, 3 to acceptable air, 4 to good air, 
and 5 to excellent air. We could just as well use the numbers 1, 7, 8, 15, 23, 
or the numbers 1.2, 6.5, 8.7, 205.6, 750, or any numbers that preserve the 
order. If a scale is unique only up to order, the admissible transformations 
are monotone increasing functions φ(x), that is, functions φ: f(A) → B 
satisfying the condition that 


    x ≥ y iff φ(x) ≥ φ(y), 

or equivalently the condition 


    x > y iff φ(x) > φ(y). 


Such scales are called ordinal scales. Another example of an ordinal scale is 
the Mohs scale of hardness. Numbers are assigned to minerals, reflecting 
their relative hardness subject to the restriction that mineral a gets a larger 
number than mineral b if and only if mineral a is harder than mineral b. 
(In practice, a is judged to be harder than b if a scratches b.) Raw scores 
on an intelligence test probably define only an ordinal scale (Stevens [1959, 
p. 25]). Later we shall see that preference may be no more than an ordinal 
scale. 
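
The air quality example can be restated as a small Python check. The three numberings are the ones given in the text; the helper names `same_order` and `gap_claim` are hypothetical. The sketch confirms that the alternative numberings preserve order, and shows that a statement comparing differences of scale values does not survive the change of numbering.

```python
labels = ["unhealthy", "unsatisfactory", "acceptable", "good", "excellent"]
scale = dict(zip(labels, [1, 2, 3, 4, 5]))               # the scale described in the text
alt_a = dict(zip(labels, [1, 7, 8, 15, 23]))             # first alternative numbering
alt_b = dict(zip(labels, [1.2, 6.5, 8.7, 205.6, 750]))   # second alternative numbering

def same_order(u, v):
    """True if u and v rank every pair of labels the same way."""
    return all((u[a] > u[b]) == (v[a] > v[b]) for a in labels for b in labels)

order_kept_a = same_order(scale, alt_a)
order_kept_b = same_order(scale, alt_b)

def gap_claim(u):
    """The claim 'good exceeds acceptable by as much as acceptable exceeds unsatisfactory'."""
    return (u["good"] - u["acceptable"]) == (u["acceptable"] - u["unsatisfactory"])

gap_equal_original = gap_claim(scale)      # 4 - 3 compared with 3 - 2
gap_equal_alt_a = gap_claim(alt_a)         # 15 - 8 compared with 8 - 7
```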
Finally, in some scales, all one-to-one functions φ define admissible 
transformations. Such scales are called nominal. Examples of nominal 
scales are numbers on the uniforms of baseball players or the numbering 
of alternative plans as plan 1, plan 2, etc. Many coding systems such as 
curricular codes used in college catalogues to identify the department 
define nominal scales. The actual number has no significance, and any 
change of numbers will contain the same information: identification of the 
elements of the set A. 

In general, the scale types listed in Table 2.1 go from “strongest” to 
“weakest,” in the sense that absolute scales and ratio scales contain much 
more information than ordinal scales or nominal scales. It is often a goal 
of measurement to obtain as strong a scale as possible. 

It is interesting to note that measurement can progress from lower to 
higher scale types. Stevens [1959, p. 24] gives a very nice discussion of this 
point. Early men probably distinguished only between cold and warm, 
thus using a nominal scale. Later, degrees of warmer and colder might 
have been introduced, corresponding to various natural events. This would 
give an ordinal scale. Later, introduction of thermometers led to interval 
scales of temperature. Finally, the development of thermodynamics led to 
a ratio scale of temperature, the Kelvin scale. 

There are several less common scale types which are important in the 
social sciences, and we mention them briefly. (See Table 2.2.) A scale is 
called a log-interval scale if the admissible transformations are functions of 
the form φ(x) = αx^β, α, β > 0. Log-interval scales are important in 
psychophysics, where they are considered as scale types for the psychophysical 
functions relating a physical quantity (for example, intensity of a sound) to 
a psychological quantity (for example, loudness of a sound). We shall 
encounter these psychophysical functions and log-interval scales in 
Chapter 4. 

Another less common scale type is the difference scale. Here, the 
admissible transformations are functions of the form φ(x) = x + β. We shall not 
encounter many difference scales in this volume. Suppes and Zinnes [1963, 
Section 4.2] give an example from the psychological literature, the so-called 
Thurstone Case V scale, which is a measure of response strength. 
Difference scales also arise when we make logarithmic transformations of 
ratio scales. For example, if f(x) measures the mass of x, then f is unique 
up to multiplication by a positive constant α. But log f is unique up to 
addition of a constant β, for multiplication of f(x) by α > 0 corresponds 
to addition to log f(x) of β = log α. (Similarly, log-interval scales 
correspond to exponential transformations of interval scales.) 


Table 2.2. Some Other Scale Types 

Admissible Transformations    Scale Type     Example 

φ(x) = αx^β, α, β > 0         Log-interval   Psychophysical function 
φ(x) = x + β                  Difference     Thurstone Case V 
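
The correspondence between ratio and difference scales, and between interval and log-interval scales, can be checked numerically. In the Python sketch below the masses and the constants `h`, `a`, `b` are hypothetical values chosen only for illustration.

```python
import math

mass_g = {"x": 250.0, "y": 1000.0}        # hypothetical masses in grams
alpha = 1 / 1000                          # similarity transformation: grams to kilograms

log_g = {k: math.log(v) for k, v in mass_g.items()}
log_kg = {k: math.log(alpha * v) for k, v in mass_g.items()}

# Every log value moves by the same constant beta = log(alpha).
beta = math.log(alpha)
shifts = [log_kg[k] - log_g[k] for k in mass_g]
constant_shift = all(math.isclose(s, beta) for s in shifts)

# Conversely, exponentiating an interval scale h -> a*h + b gives e^b * (e^h)^a,
# which is a log-interval transformation with coefficient e^b and exponent a.
h, a, b = 2.0, 3.0, 0.5                   # hypothetical interval-scale value and constants
lhs = math.exp(a * h + b)
rhs = math.exp(b) * math.exp(h) ** a
```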

To illustrate the definition of scale type, let A = {r, s, t} and let R = 
{(r, s), (s, t), (r, t)}. Then (A, R) is homomorphic to (Re, >). One 
homomorphism is given by f(r) = 2, f(s) = 1, f(t) = 0. By the Corollary to 
Theorem 2.1, ((A, R), (Re, >), f) is regular. Moreover, φ: f(A) → Re is 
admissible if and only if (φ ∘ f)(r) > (φ ∘ f)(s), (φ ∘ f)(s) > (φ ∘ f)(t), and 
(φ ∘ f)(r) > (φ ∘ f)(t); that is, if and only if φ(2) > φ(1), φ(1) > φ(0), 
and φ(2) > φ(0). Thus, φ is admissible if and only if φ is a monotone 
increasing function on f(A). Thus, f is an ordinal scale. 
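
The equivalence between admissible and monotone increasing in this three-element example can be confirmed by brute force. The Python sketch below enumerates every function from f(A) = {0, 1, 2} into a small candidate set of values; the candidate set {−2, ..., 2} is an assumption made to keep the enumeration finite.

```python
from itertools import product

A = ["r", "s", "t"]
R = {("r", "s"), ("s", "t"), ("r", "t")}
f = {"r": 2, "s": 1, "t": 0}

def admissible(phi):
    """phi (a dict on f(A)) is admissible iff phi o f is a homomorphism into (Re, >)."""
    return all(((a, b) in R) == (phi[f[a]] > phi[f[b]]) for a, b in product(A, repeat=2))

def monotone_increasing(phi):
    """x > y iff phi(x) > phi(y), for all x, y in the domain of phi."""
    return all((x > y) == (phi[x] > phi[y]) for x, y in product(phi, repeat=2))

candidates = range(-2, 3)   # small finite set of target values (assumption)
all_phi = [dict(zip([0, 1, 2], v)) for v in product(candidates, repeat=3)]
agree = all(admissible(p) == monotone_increasing(p) for p in all_phi)
n_admissible = sum(admissible(p) for p in all_phi)
```

Of the 125 candidate functions, the admissible ones are exactly the strictly increasing ones.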

Before closing this section, let us observe that if 𝔄 → 𝔅 is not a regular 
representation, then the notion of scale type runs into trouble. For 
example, there can be two homomorphisms f and g from 𝔄 to 𝔅 such that one is 
one type of scale and the other is another. However, if 𝔄 → 𝔅 is a regular 
representation, this problem does not arise.* 


THEOREM 2.2 (Roberts and Franke [1976]). If the representation 𝔄 → 𝔅 is 
regular and f and g are homomorphisms from 𝔄 to 𝔅, then f is an absolute, 
ratio, interval, ordinal, or nominal scale if and only if g is, respectively, an 
absolute, ratio, interval, ordinal, or nominal scale. 


Proof. We shall give the proof for the ordinal case. Suppose f is an 
ordinal scale. By regularity, we know that g = φ ∘ f, for some φ: f(A) → B. 
Since g is a homomorphism, φ is an admissible transformation of f. 
Moreover, since f is an ordinal scale, φ must be monotone increasing. To 
show that g is an ordinal scale, suppose first that φ′: g(A) → B is monotone 
increasing. Then φ′ is an admissible transformation of g. For φ′ ∘ g = 
φ′ ∘ (φ ∘ f) = (φ′ ∘ φ) ∘ f. Since φ′ ∘ φ is monotone increasing, it is an 
admissible transformation of f, and we conclude that φ′ ∘ g = (φ′ ∘ φ) ∘ f 
is a homomorphism from 𝔄 into 𝔅. Thus, φ′ is an admissible 
transformation of g. Finally, suppose that φ′: g(A) → B is an admissible 
transformation of g. We show that φ′ is monotone increasing. For φ′ ∘ g = 
φ′ ∘ (φ ∘ f) = (φ′ ∘ φ) ∘ f is a homomorphism from 𝔄 into 𝔅, and so φ′ ∘ φ 
is an admissible transformation of f. Therefore, φ′ ∘ φ is a monotone 
increasing function on f(A). Now if x, y are in g(A) and x > y, we shall 
show that φ′(x) > φ′(y). We know that x = φ(u) and y = φ(v) for some 
u, v in f(A). Moreover, since φ is monotone increasing, x > y implies that 
u > v. Finally, since φ′ ∘ φ is monotone increasing, it follows that 


    (φ′ ∘ φ)(u) > (φ′ ∘ φ)(v), that is, φ′(x) > φ′(y). ∎ 


*After reading the statement of the next theorem, the reader may wish to skip immediately 
to the remarks at the end of this section, which he may do with no loss of continuity. 
To show that the definition of scale type can lead to dilemmas for 
irregular scales, let us recall the homomorphisms f and g from (A, R) into 
(Re, M), where f, g, A, R, and M are as defined in Eqs. (2.13) through 
(2.15). The representation (A, R) → (Re, M) is irregular, since g is not 
φ ∘ f for any φ: f(A) → Re. The scale f is ordinal, for every transformation of 
f(A) is monotone increasing, and every monotone increasing 
transformation φ: f(A) → Re is an admissible transformation.* However, g is not an 
ordinal scale, since the monotone increasing function φ: g(A) → Re defined 
by φ(x) = 2x is not admissible. For (φ ∘ g)(r) = 0, (φ ∘ g)(s) = 2, and not 
0M2, even though rRs holds. Thus, φ ∘ g is not a homomorphism from 
(A, R) into (Re, M), and φ is not admissible. 

This example is a bit unsatisfying because the range of f, f(A), has just 
one element. The following example, due to Roberts and Franke [1976], 
should be more convincing. Define R on Re by 


    xRy iff [(x < 0 and y ≥ 0) or (x = 0 and y > 0)].               (2.17) 


That is, all negative numbers are in the relation R to all nonnegative 
numbers, and 0 is in the relation R to all positive numbers. Let 𝔄 = 
(Re, R, ×), where × is ordinary multiplication on Re. Define g: Re → Re 
and f: Re → Re by 


           −1  if x < 0, 
    g(x) =  0  if x = 0,                                            (2.18) 
            1  if x > 0, 


and f(x) = x, for all x ∈ Re. Then f and g are clearly homomorphisms from 
𝔄 into 𝔅 = (Re, R, ×). Moreover, we shall show that (𝔄, 𝔅, g) is an 
absolute scale, while (𝔄, 𝔅, f) is not. The latter follows, since φ(x) = g(x) 
is an admissible transformation. To show the former, suppose φ: g(Re) → 
Re is an admissible transformation. Then −1R0 and 0R1, so φ(−1)Rφ(0) 
and φ(0)Rφ(1). Hence, by definition of R, φ(−1) must be negative, φ(0) 
must be 0, and φ(1) must be positive. Now 1 × 1 = 1, so φ(1) × φ(1) = 
φ(1). Hence, φ(1) = 1. Moreover, (−1) × (−1) = 1, so φ(−1) × φ(−1) = 
φ(1) = 1. Thus, φ(−1) = −1, since φ(−1) is negative. Thus, we have 
shown that φ on g(Re) = {−1, 0, 1} is the identity. 
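
The argument that g defines an absolute scale can be mirrored by a brute-force search. The Python sketch below tries every function from g(Re) = {−1, 0, 1} into a small grid of candidate values and keeps those that preserve both R and multiplication on g(Re). The grid is an assumption made to keep the search finite, so the search illustrates the argument rather than replacing it; the name `admissible_on_range` is hypothetical.

```python
from itertools import product

def R(x, y):
    """The relation of Eq. (2.17)."""
    return (x < 0 and y >= 0) or (x == 0 and y > 0)

def g(x):
    """The homomorphism of Eq. (2.18): the sign of x."""
    return -1 if x < 0 else (0 if x == 0 else 1)

range_g = [-1, 0, 1]                       # g(Re)
candidates = [-2, -1, -0.5, 0, 0.5, 1, 2]  # finite grid of possible images (assumption)

def admissible_on_range(phi):
    """Check on g(Re) that phi preserves R and multiplication."""
    keeps_R = all(R(x, y) == R(phi[x], phi[y]) for x, y in product(range_g, repeat=2))
    keeps_times = all(phi[x * y] == phi[x] * phi[y] for x, y in product(range_g, repeat=2))
    return keeps_R and keeps_times

survivors = [phi for phi in (dict(zip(range_g, v)) for v in product(candidates, repeat=3))
             if admissible_on_range(phi)]
```

Because g is a homomorphism onto {−1, 0, 1}, testing φ on that range is equivalent to testing φ ∘ g on all of Re. Only the identity survives the search.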

It would be interesting to find other examples of irregular representa- 
tions where one homomorphism with a nontrivial range gives rise to an 
ordinal, interval, or ratio scale, while another homomorphism does not. 
Such examples have not been given in the literature. It would also be 
helpful to determine whether broader conditions than regularity of a 
representation 𝔄 → 𝔅 are sufficient for all homomorphisms from 𝔄 into 𝔅 
to have the same scale type. 


*Indeed, f is even an interval scale. (A scale can have several types.) 
Remarks 


1. Given empirically obtained measurements, the problem of determin- 
ing what kind of scale they define cannot always be solved by using a 
formal approach of the type we have described. We may, for example, not 
have a specific representation in mind. In these cases, we must resort to an 
alternative definition of admissible transformation, as one that keeps 
“intact the empirical information depicted by the scale,” to follow Stevens 
[1968, p. 850]. The scale type, which is based on the definition of admissi- 
ble transformation, can then be difficult to determine in practice, because 
it involves capturing the vague notion of “empirical information” depicted 
by a scale. We shall see this in Section 2.6. As Adams, Fagot, and 
Robinson [1965, p. 122] point out, much of the criticism of the applications 
of Stevens’ theory of scale type has centered around measurements where 
the class of admissible transformations is not clearly defined. It seems 
likely that such criticism will continue until the scales used to measure 
loudness, brightness, IQ, etc., are put on a firmer measurement-theoretic 
foundation. 


2. Other writers have classified scale type a little differently than we do 
in this section. Coombs [1952] considered the four scales—nominal, ordi- 
nal, interval, and ratio—which are the four types of scales considered by 
Stevens [1951]. However, Coombs then added a partially ordered scale, 
falling between the nominal and the ordinal. He also obtained a more 
detailed classification of scales by asking first whether objects are just 
classified, partially ordered, or completely ordered, and then whether 
distances between objects are classified (large, small, etc.), partially 
ordered, or completely ordered. In all, this two-way classification led to 
eleven different scale types. Coombs, Raiffa, and Thrall [1954] made a 
similar distinction. Torgerson [1958] argued that a nominal scale is not 
really an example of measurement, since the numbers assigned do not 
reflect any real properties of the systems or objects being measured. He 
also distinguished between ordinal scales with natural origins and ordinal 
scales without natural origins. 


Exercises 


1. (a) If (𝔄, 𝔅, f) is an ordinal scale with f(A) = Re, which of the 
following functions φ: Re → Re are admissible transformations? 
(i) φ(x) = e^x. 
(ii) φ(x) = x + 7. 
(iii) φ(x) = 801x. 
(iv) φ(x) = x². 
(v) φ(x) = 8x⁴. 
(vi) φ(x) = x² + 10. 
(b) Repeat for f an interval scale. 
(c) A ratio scale. 
(d) A nominal scale. 
(e) An absolute scale. 
(f) A difference scale. 
(g) A log-interval scale. 
2. In football, the numbering of uniforms is not totally arbitrary. In 
some numbering schemes, offensive backs receive numbers lower than 50, 
ends receive numbers in the 80’s and 90’s, etc. Discuss when the number- 
ing of uniforms, plans, alternatives, etc., defines a nominal scale. 

3. Suppose A = {a,b,c}, R = {(a,b), (a,c), (b,c)}, and f(a) = 3, 
f(b) = 2, f(c) = 1. Show that ((A, R), (Re, >), f) is a (regular) ordinal 
scale, but not an interval scale or a ratio scale. 

4. Let A = {a, b, c}, R = {(a, a), (b, b), (c, c), (a, b), (a, c), (b, c)}. 
Show that (A, R) is homomorphic to $(Re, \geq)$ and that every homomorphism 
f defines a (regular) ordinal scale. 

5. Suppose A = {a, b, c}, R = {(a, a), (b, b), (c, c)}. Show that (A, R) 
is homomorphic to (Re, =) and every homomorphism f defines a (regular) 
nominal scale. 


6. Suppose $A = \{0, 1\}$, R = >, and $\circ$ is defined by 

$$0 \circ 0 = 0 \circ 1 = 1 \circ 0 = 0, \quad 1 \circ 1 = 1.$$


Show that $(A, R, \circ)$ is homomorphic to $(Re, >, \times)$ and every homomorphism 
f defines a (regular) absolute scale. 

7. Suppose A = {r, s}, R = {(r, s)}, f(r) = 1, and f(s) = 0. Show that f 
is a (regular) homomorphism from (A, R) into (Re, >) which defines both 
an ordinal and an interval scale. 

8. Suppose 0 and f are defined as in Exer. 7 of Section 2.2. 


(a) Show that every admissible transformation $\phi$ of f is monotone 
nondecreasing, that is, it satisfies 

$$x \geq y \Rightarrow \phi(x) \geq \phi(y)$$

for all x, y in f(A). 

(b) Show that every monotone nondecreasing $\phi$ on f(A) is admissible. 

(c) Show that statements (a) and (b) hold for any homomorphism 
from (Re, 0) into (Re, 9). 

9. Suppose $(\mathfrak{A}, \mathfrak{B}, f)$ defines a (regular) interval scale and g is another 
homomorphism from $\mathfrak{A}$ into $\mathfrak{B}$. Show that it follows that ratios of intervals 
are the same under f and g, that is, that 


$$\frac{f(a) - f(b)}{f(c) - f(d)} = \frac{g(a) - g(b)}{g(c) - g(d)}.$$


10. Suppose R is defined on Re by (2.17), g on Re by (2.18), and f on Re 
by f(x) = x. 
(a) Observe that g and f are homomorphisms from (Re, R) into 
(Re, R). [In the text, we considered these as homomorphisms from 
$(Re, R, \times)$ into $(Re, R, \times)$.] 
(b) Observe that every admissible transformation of g is monotone 
increasing, but there are admissible transformations of f that are not. 
11. Prove Theorem 2.2 for 
(a) absolute scales; 
(b) ratio scales; 
(c) interval scales; 
(d) nominal scales. 
12. Is Theorem 2.2 true for 
(a) log-interval scales? 
(b) difference scales? 
13. (Hays [1973, pp. 133,134], etc.) Consider what scale types (if any) are 
most appropriate for the following data: 
(a) The nationality of an individual’s male parent. 
(b) Hand pressure as applied to a bulb or dynamometer. 
(c) Memory ability as measured by the number of words recalled 
from an initially memorized list. 
(d) Distance by air between New York and other cities. 
(e) U.S. Department of Agriculture classification of cuts of meat 
(“choice,” etc.). 
(f) The tensile strength or force required to break a wire. 
(g) The excellence of a baseball team as measured by the number of 
games won during a season. 
(h) Zip codes. 
(i) Commerce Department three-digit industry classification codes. 
Given a theory of scale type, let us return to our definition of meaningfulness 
and test it on the examples we stated at the beginning of Section 
2.2. We shall assume that all the scales in question come from regular 
representations, and so we may use our second definition of meaningful- 
ness: A statement involving numerical scales is meaningful if and only if 
its truth (or falsity) remains unchanged under all admissible transforma- 
tions of all the scales involved. 

Let us first consider the statement 


$$f(a) = 2f(b), \tag{2.19}$$


where f(a) is some quantity assigned to a, for example, its mass or its 
temperature. We ask under what circumstances this statement is meaningful. 
According to the definition, it is meaningful if and only if its truth 
value is preserved under all admissible transformations $\phi$, that is, if and 
only if, under all such $\phi$, 


$$f(a) = 2f(b) \iff (\phi \circ f)(a) = 2[(\phi \circ f)(b)].$$
If $\phi$ is a similarity transformation, that is, if $\phi(x) = \alpha x$, some $\alpha > 0$, then 
we do indeed have 


$$f(a) = 2f(b) \iff \alpha f(a) = 2\alpha f(b).$$


We conclude that the statement (2.19) is meaningful if the scale f is a ratio 
scale, as is the case in the measurement of mass. On the other hand, 
suppose it is only an interval scale, as is the case (usually) in the measurement 
of temperature. Then a typical admissible transformation has the 
form $\phi(x) = \alpha x + \beta$, $\alpha > 0$. Certainly we can find examples of interval 
scales f such that $f(a) = 2f(b)$, but $\alpha f(a) + \beta \neq 2[\alpha f(b) + \beta]$. In particular, 
if we choose $f(a) = 2$, $f(b) = 1$, $\alpha = 1$, and $\beta = 1$, this is the case: the 
left side is 3 and the right side is 4. Thus, 
in general, the statement (2.19) is meaningless if f is an interval scale. We 
use the term “in general” because we have only given an example of an 
interval scale for which (2.19) is meaningless. In the future, we shall drop 
this term, and call a statement given in terms of an abstract scale of a 
given type meaningless if it is meaningless for some example of a scale of 
the given type. In our case, it is easy to see that the statement (2.19) is 
meaningless for every interval scale. For whenever $f(a) = 2f(b)$, taking 
$\alpha = \beta = 1$ gives us $\alpha f(a) + \beta \neq 2[\alpha f(b) + \beta]$. Conversely, whenever 
$f(a) \neq 2f(b)$, taking $\alpha = 1$ and $\beta = f(a) - 2f(b)$ gives us 
$\alpha f(a) + \beta = 2[\alpha f(b) + \beta]$. 
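
The contrast between ratio and interval scales can be checked numerically. The following Python sketch tests statement (2.19) against a small sample of similarity transformations and of positive linear transformations. The scale values and the sampled values of alpha and beta are hypothetical, chosen only for illustration. A finite sample can refute meaningfulness, but it cannot by itself establish it.

```python
def truth_is_preserved(statement, scale, transformations):
    """True if statement(phi o scale) agrees with statement(scale) for every phi tried."""
    original = statement(scale)
    return all(
        statement({x: phi(value) for x, value in scale.items()}) == original
        for phi in transformations
    )


def twice_as_large(scale):
    # Statement (2.19): f(a) = 2 f(b)
    return scale["a"] == 2 * scale["b"]


# Hypothetical scale values, chosen so that (2.19) is true.
f = {"a": 2, "b": 1}

# A finite sample of similarity transformations x -> alpha * x (ratio scale).
similarities = [lambda x, alpha=alpha: alpha * x for alpha in (1, 2, 5, 10)]

# A finite sample of positive linear transformations x -> alpha * x + beta
# (interval scale); it includes the choice alpha = beta = 1 used in the text.
positive_linear = [
    lambda x, alpha=alpha, beta=beta: alpha * x + beta
    for alpha in (1, 2)
    for beta in (0, 1)
]

preserved_on_ratio_sample = truth_is_preserved(twice_as_large, f, similarities)
preserved_on_interval_sample = truth_is_preserved(twice_as_large, f, positive_linear)
print(preserved_on_ratio_sample, preserved_on_interval_sample)
```

On this sample the truth value survives every similarity transformation but not every positive linear one, in line with the argument above.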

The above discussion explains why Statement III of Section 2.2 about 
one can of corn weighing twice as much as another makes sense, whereas 
Statement IV about one can having twice the temperature of a second does 
not. 

In the same way, one can explain why Statement I about the number of 
cans makes sense, whereas Statement II about the weight of a can does 
not. To see this, consider the statement 


$$f(a) \geq 10. \tag{2.20}$$


If f is an absolute scale, as in the case of counting, we have for every 
admissible transformation $\phi$, 


$$f(a) \geq 10 \iff (\phi \circ f)(a) \geq 10,$$


for the only admissible $\phi$ is the identity transformation. Notice that f(a) 
need not be greater than or equal to 10 for the statement $f(a) \geq 10$ to be 
meaningful. Meaningfulness is different from truth; we simply want to 
know whether or not it makes sense to make the assertion. If f is a ratio 
scale, as in the case of weight, then the statement (2.20) is meaningless; for 
example, if $f(a) \geq 10$ is true for some f, then taking $\alpha$ sufficiently small but 
positive makes $\alpha f(a) \geq 10$ false. 
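
As a small illustration of the same point, the sketch below collects the truth values that statement (2.20) takes under the admissible transformations of an absolute scale and under a few sampled similarity transformations. The value 12 and the sampled values of alpha are hypothetical.

```python
from fractions import Fraction


def at_least_ten(value):
    # Statement (2.20): f(a) >= 10
    return value >= 10


f_a = 12  # hypothetical scale value of a

# Absolute scale: the identity is the only admissible transformation.
truth_values_absolute = {at_least_ten(f_a)}

# Ratio scale: x -> alpha * x is admissible for every alpha > 0 (three sampled here).
truth_values_ratio = {at_least_ten(alpha * f_a) for alpha in (Fraction(1, 2), 1, 2)}

print(truth_values_absolute, truth_values_ratio)
```

A single truth value in the first set and two in the second reflect that (2.20) is meaningful for the absolute scale but not for the ratio scale.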
We continue with several other examples. Suppose $(\mathfrak{A}, \mathfrak{B}, f)$ is a scale 
and a, b are in A, the underlying set of $\mathfrak{A}$. Let us consider the statement 


$$f(a) + f(b) = 20. \tag{2.21}$$


Thus, (2.21) might be the statement that the sum of the weight of a and the 
weight of b is a constant, 20. Is this meaningful? The answer is no if f is a 
ratio scale. For if $f(a) + f(b) = 20$, then $\alpha f(a) + \alpha f(b) = 20\alpha \neq 20$ for 
$\alpha \neq 1$. However, the statement that f(a) + f(b) is constant for all a, b in A 
is meaningful if f is a ratio scale. 

To give yet another example, consider the statement 


$$f(a) > f(b). \tag{2.22}$$

If $\phi(x) = \alpha x + \beta$, $\alpha > 0$, then 

$$\alpha f(a) + \beta > \alpha f(b) + \beta \iff \alpha f(a) > \alpha f(b) \iff f(a) > f(b).$$


Thus, the statement (2.22) is meaningful if f is an interval scale. Indeed, it 
is meaningful if f is an ordinal scale. (Why?) Thus, for example, to say that 
the hardness of a is greater than the hardness of b is meaningful. 

Next, let us consider the statement 


$$f(a) - f(b) > f(c) - f(d). \tag{2.23}$$

If $\phi(x) = \alpha x + \beta$, $\alpha > 0$, then 


$$[\alpha f(a) + \beta] - [\alpha f(b) + \beta] > [\alpha f(c) + \beta] - [\alpha f(d) + \beta] \iff f(a) - f(b) > f(c) - f(d).$$


Thus, the statement (2.23) is meaningful if f is an interval scale and of 
course if f is a ratio scale. It is not meaningful if f is an ordinal scale. To 
give an example, let $A = \{a, b, c, d\}$, let $f(a) = 10$, $f(b) = 6$, $f(c) = 4$, 
$f(d) = 2$, and let $\phi(10) = 11$, $\phi(6) = 9$, $\phi(4) = 7$, $\phi(2) = 3$. Then 


$$f(a) - f(b) > f(c) - f(d).$$


Moreover, $\phi$ is monotone increasing on $f(A) = \{10, 6, 4, 2\}$. But 


$$(\phi \circ f)(a) - (\phi \circ f)(b) = 11 - 9 = 2,$$


which is less than 


$$(\phi \circ f)(c) - (\phi \circ f)(d) = 7 - 3 = 4.$$
The reader might wish to consider whether (2.23) could ever be meaningful 
for f an ordinal scale. We conclude from our analysis that it is meaningful, 
for example, to compare temperature differences, that is, to say that the 
difference in temperature between a and b is greater than the difference in 
temperature between c and d. However, a similar comparison of dif- 
ferences in hardness is not meaningful. 
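
The ordinal counterexample can be replayed directly. The sketch below uses the scale values and the transformation given in the text; only the helper names are our own.

```python
# Scale values and transformation taken from the counterexample in the text.
f = {"a": 10, "b": 6, "c": 4, "d": 2}
phi = {10: 11, 6: 9, 4: 7, 2: 3}


def difference_is_greater(scale):
    # Statement (2.23): f(a) - f(b) > f(c) - f(d)
    return scale["a"] - scale["b"] > scale["c"] - scale["d"]


# phi is strictly increasing on f(A), so it is admissible for an ordinal scale.
values = sorted(phi)
phi_is_monotone = all(phi[x] < phi[y] for x, y in zip(values, values[1:]))

transformed = {x: phi[value] for x, value in f.items()}
before = difference_is_greater(f)            # compares 10 - 6 with 4 - 2
after = difference_is_greater(transformed)   # compares 11 - 9 with 7 - 3
print(phi_is_monotone, before, after)
```

The transformation preserves order, yet the comparison of differences changes its truth value, which is exactly why (2.23) is not meaningful for this ordinal scale.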

In general, ordinal scales can be used to make comparisons of size, like 


f(a) > f(b), 


interval scales to make comparisons of difference, like 


$$f(a) - f(b) > f(c) - f(d),$$


and ratio scales to make more quantitative comparisons such as 


$$f(a) = 2f(b)$$


and 


$$f(a)/f(b) = \lambda.$$


Continuing with examples, let us consider two scales, $(\mathfrak{A}, \mathfrak{B}, f)$ and 
$(\mathfrak{A}', \mathfrak{B}', g)$, where $\mathfrak{A}$ and $\mathfrak{A}'$ have the same underlying set, and let us 
consider the statement 


$$f(a) + g(a) \text{ is constant.} \tag{2.24}$$


This might be the statement that a certain gas’s temperature plus its 
pressure is constant. If f and g are both ratio scales, then to be meaningful, 
the truth or falsity of (2.24) should be unchanged under (possibly different) 
admissible transformations of each scale. That is, if $\phi(x) = \alpha x$, $\alpha > 0$, and 
$\phi'(x) = \beta x$, $\beta > 0$, then (2.24) should hold if and only if 


$$\alpha f(a) + \beta g(a) \text{ is constant.} \tag{2.25}$$


But (2.25) might very well not be true even if (2.24) is. For if we have 
$f(a) = -g(a)$ for all a, then $f(a) + g(a) = 0$, all a; but if $\alpha \neq \beta$ and f(a) is 
not constant, then $\alpha f(a) + \beta g(a) = (\alpha - \beta) f(a)$ is not constant. Thus, 
(2.24) is not meaningful if f and g are ratio scales. On the other hand, 
certainly (2.24) is meaningful if f and g are both absolute scales. 
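
A minimal sketch of this argument follows. The objects and the values of f are hypothetical; g is taken to be the negative of f, as in the text, and the two scales are rescaled independently.

```python
# Hypothetical objects and values; g = -f as in the argument above.
objects = ["a1", "a2", "a3"]
f = {"a1": 1, "a2": 2, "a3": 3}
g = {x: -value for x, value in f.items()}


def sum_is_constant(alpha, beta):
    """Truth value of: alpha * f(a) + beta * g(a) is the same for every object a."""
    sums = {alpha * f[x] + beta * g[x] for x in objects}
    return len(sums) == 1


original = sum_is_constant(1, 1)         # statement (2.24)
independent = sum_is_constant(2, 1)      # statement (2.25) with alpha != beta
common_factor = sum_is_constant(3, 3)    # statement (2.25) with alpha == beta
print(original, independent, common_factor)
```

The statement survives a common rescaling of both scales but fails once the two factors differ, which is the situation the text describes for independent ratio scales.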

We shall mention more complex examples of meaningful and meaning- 
less statements in the next two sections. For now, let us point out one 
complication. 

It is possible that in statements like (2.24), the scale g is defined in terms 
of the scale f. Then, we might not want to allow all possible admissible 
transformations of both f and g independently, but only admissible transformations 
of f and the “induced” admissible transformations of g. This 
situation will arise in the next section, when we discuss derived measure- 
ment, where one scale is defined in terms of another. In derived measure- 
ment, there will be several versions of meaningfulness, narrow and wide, 
depending on whether or not we pick admissible transformations of all 
scales independently. 


Exercises 


1. (a) Suppose f is a ratio scale. Which of the following statements are 
meaningful? 
(i) f(a) + f(b) > f(c). 
(ii) f(a) = f(d). 
(iii) f(a) = 1.8f(b). 
(iv) f(a) + f(b) is constant for all a, b in A. 
(v) f(a)f(b) > f(c)’. 
(vi) f(a) - f(b) > 2[f(c) - f(d)]. 
(vii) fla) > VOL) — f(0)] 
(viii) f(a) + f(b) > f(c)’. 
(b) Repeat for f an ordinal scale. 
(c) An interval scale. 
(d) A nominal scale. 
(e) An absolute scale. 
(f) A difference scale. 
(g) A log-interval scale. 


2. (a) Show that if f and g are (independent) ratio scales, the statement 
f(a)g(a) is constant for all a in A 


is meaningful. 
(b) Is this statement meaningful if f and g are interval scales? 
(c) What if f is an interval scale and g is a ratio scale? 
(d) What if f is an interval scale and g is an absolute scale? 
3. (a) Suppose f is an interval scale and g is an absolute scale. Show 
that neither of the following statements is meaningful: 
(i) $\log_{10}|f(a)| = 2 \log_{10}|f(b)|$. 
(ii) f(a) + g(a) is constant for all a in A. 
(b) Consider the meaningfulness of the statements in part (a) if f is a 
ratio scale and g is an absolute scale. 
(c) Repeat for f and g both ratio scales. 
4. Is there any ordinal scale f for which the statement (2.23) is 
meaningful? 
5. Are there any ratio scales f and g for which the statement (2.24) is 
meaningful? [What if (2.25) is false?] 
6. If the representation $\mathfrak{A} \to \mathfrak{B}$ is irregular, then there is always a 
meaningless statement which would be judged meaningful under our 
second definition, namely, the one in terms of admissible transformations. 
For let $(\mathfrak{A}, \mathfrak{B}, f)$ be an irregular scale. Then by Theorem 2.1, there are a 
homomorphism g and a, b in A such that $f(a) = f(b)$ and $g(a) \neq g(b)$. Show that the statement 
$f(a) = f(b)$ is meaningful under the second definition, but it is meaningless. 


7. Consider the relational system (A, R) of Eq. (2.11), and the homo- 
morphism g of Eq. (2.12) from (A, R) into (Re, >,), where >, is defined 
in Eq. (2.10). 

(a) Show that [(A, R), (Re, >), g] defines an irregular scale. 
(b) Comment on the meaningfulness of the assertion g(s) > g(t). 
(c) Comment on the meaningfulness of the assertion g(r) > g(t). 

8. Consider the relational system (A, R) of Exer. 6, Section 2.2, and the 
homomorphism g from (A, R) into (Re, A) given there. Comment on the 
meaningfulness of the following assertions: 

(a) g(s) > g(r). 

(b) g(2) > g(r). 

(c) $g(t) \neq g(r)$. 

9. Comment on the meaningfulness of the following statements: 

(a) This shelf is three times as long as that one. 

(b) A patient’s height is twice his weight. 

(c) Glass is twice as hard as paper. 

(d) The school day in England is one and one-half times as long as 
that in the United States. 

(e) A can’s weight is greater than its height. 

(f) A circle’s area is three times that of a second circle. 

(g) The wind yesterday was calmer than it is today (the Beaufort 
wind scale classifies winds as calm, light air, light breeze, ... ). 

(h) This rat weighs more than any of the others. 

(i) This rat weighs more than those two combined. 


2.5 Derived Measurement 


Very often we are given certain numerical scales or assignments and we 
want to introduce new scales defined in terms of the old ones. Most scales 
in the physical sciences have this property. A typical one is density d, 
which can be defined in terms of mass m and volume V as d = m/V. If 
density is simply defined from mass and volume, then density is not 
measured fundamentally in the sense we have been describing, but rather it 
is derived from other scales, which may or may not be fundamental. In this 
section, we shall discuss the process of obtaining derived scales. 

We do not present an elaborate formal representation theory of derived 
measurement. Indeed, there is no generally accepted theory. The approach 
to derived measurement which we shall present is based on that of Suppes 
and Zinnes [1963]. This is to be contrasted with the approaches of 
Campbell [1920, 1928] and Ellis [1963], who emphasize “dimensional” 
parameters (see below for an example), and of Causey [1967, 1969], who 
deals with “classes of similar systems.” Some writers, for example Pfanzagl 
[1968, p. 31], argue that derived measurement is not measurement at all. 
Pfanzagl argues that if a property measured by a numerical scale had any 
empirical meaning of its own, then there would also be a fundamental 
scale. The defining relation then simply becomes an empirical law between 
fundamental scales. 

Certainly it is true that derived measurement is often a relative matter. 
The same scale can be developed as either fundamental or derived. 
Usually, some basic scales are chosen and others are derived from them. In 
this sense, it is possible to define density using fundamental measurement; 
see the discussion in Section 5.4, Exer. 26. However, it is not always so 
easy or so natural to treat derived scales as fundamental. Besides, derived 
measurement corresponds to a process which is frequently used in practice. 
Hence, it is important to have a theory for this kind of measurement. 

To have in mind a firm idea of what we mean by derived measurement, 
let us suppose that A is a set and $f_1, f_2, \ldots, f_n$ are given real-valued 
functions on A. We call these functions primitive scales, and define a new 
real-valued function g on A in terms of these primitive scales. The function 
g is called the derived scale. This definition is very broad. Indeed, Causey 
[1969] argues that it is deficient because the derived scale is not required to 
reflect in any direct manner the characteristics of empirical relational 
systems. As Adams [1966] points out, weight times volume and weight plus 
volume are equally good derived measures according to the definition, 
regardless of empirical significance. In spite of these objections, we feel 
that this broad notion of derived scale leads to some empirically useful 
results, and we shall adopt it. 

The definition of the derived scale g in terms of the $f_i$ need not be in 
terms of an equation. A simple example to illustrate this last point is the 
following contrived example. Suppose $u: A \to Re$ is an ordinal utility 
function, that is, a function satisfying Eq. (2.3). Let v be a function on A 
with the property that 


$$u(a) > u(b) \text{ iff } v(a) < v(b).$$


Then v is a derived scale. A representation theorem will state (necessary 
and) sufficient conditions for the existence of a function satisfying the 
definition. (In this case, the function v always exists. However, the function 
v is not unique.) We present a less contrived example in Section 6.2.2. 
In general, the derived scale g and the primitive scales $f_1, f_2, \ldots, f_n$ will 
satisfy a certain condition $C(f_1, f_2, \ldots, f_n, g)$, and any function g satisfying 
this condition will be acceptable. Condition C may of course be an 
equation relating g to $f_1, f_2, \ldots, f_n$. In the case of density, $C(m, V, d)$ 
holds if and only if $d = m/V$. 
A more important problem for derived measurement is the uniqueness 
problem. There are two different senses of uniqueness, depending on 
whether or not we allow the primitive scales $f_1, f_2, \ldots, f_n$ to vary. For 
example, in the case of density, if m and V are not allowed to vary, then d 
is defined uniquely in terms of m and V. However, both m and V are ratio 
scales. If we allow m and V to be replaced by other allowable scales, then d 
can vary. If $m'$ and $V'$ are other allowable scales, then there are positive 
numbers $\alpha$ and $\beta$ such that $m'(a) = \alpha m(a)$ and $V'(a) = \beta V(a)$, for all a in 
A, the set of objects being measured. The corresponding derived scale of 
density is given by 

$$d'(a) = \frac{m'(a)}{V'(a)} = \frac{\alpha m(a)}{\beta V(a)} = \frac{\alpha}{\beta} d(a).$$


Thus, $d'$ is related to d by a similarity transformation. 

If a derived scale g is defined from primitive scales $f_1, f_2, \ldots, f_n$ by 
condition C, we say that a function $\phi: g(A) \to Re$ is admissible in the narrow 
sense if $g' = \phi \circ g$ satisfies 


$$C(f_1, f_2, \ldots, f_n, g').$$


We say $\phi$ is admissible in the wide sense if there are acceptable replacement 
scales $f_1', f_2', \ldots, f_n'$ for $f_1, f_2, \ldots, f_n$, respectively, so that 


$$C(f_1', f_2', \ldots, f_n', g').$$


(In particular, if each $f_i$ is a regular scale, the $f_i'$ would be defined by taking 
appropriate admissible transformations of the $f_i$.) In the case of density, the 
identity is the only admissible transformation in the narrow sense. We 
have shown that every admissible transformation in the wide sense is a 
similarity transformation. Conversely, every similarity transformation is 
admissible in the wide sense. For suppose $d' = \alpha d$, $\alpha > 0$. Then 
$C(\alpha m, V, d')$ holds, so $d'$ is obtained from d by an admissible transformation 
in the wide sense. 
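
The wide-sense behavior of density can be illustrated with exact arithmetic. In the sketch below the objects, their masses and volumes, and the conversion factors alpha and beta are all hypothetical; the point is only that replacing m by alpha m and V by beta V multiplies every density by the same factor alpha/beta.

```python
from fractions import Fraction

# Hypothetical objects with masses and volumes in some original units.
mass = {"x": Fraction(6), "y": Fraction(3)}
volume = {"x": Fraction(2), "y": Fraction(4)}


def density(m, V):
    """Derived scale d = m / V, computed object by object."""
    return {a: m[a] / V[a] for a in m}


d = density(mass, volume)

# Hypothetical changes of unit: m' = alpha * m and V' = beta * V.
alpha, beta = Fraction(1000), Fraction(1, 8)
d_prime = density(
    {a: alpha * value for a, value in mass.items()},
    {a: beta * value for a, value in volume.items()},
)

# Every density is multiplied by the same factor alpha / beta.
factors = {d_prime[a] / d[a] for a in d}

# "x is four times as dense as y" keeps its truth value.
ratio_before = d["x"] / d["y"]
ratio_after = d_prime["x"] / d_prime["y"]
print(factors, ratio_before, ratio_after)
```

Since a single factor relates the two density scales, ratios of densities are unchanged, which is why such comparisons make sense even in the wide sense.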

We say that the derived scale g is regular in the narrow sense if, whenever 
$C(f_1, f_2, \ldots, f_n, g')$ holds, then there is a transformation $\phi: g(A) \to Re$ such 
that $g' = \phi \circ g$. The transformation $\phi$ is an admissible transformation in 
the narrow sense. We say g is regular in the wide sense if, whenever 
$f_1', f_2', \ldots, f_n'$ are acceptable replacement scales for scales $f_1, f_2, \ldots, f_n$, 
and whenever $C(f_1', f_2', \ldots, f_n', g')$ holds, then there is a transformation 
$\phi: g(A) \to Re$ such that $g' = \phi \circ g$. Here, $\phi$ is an admissible transformation 
in the wide sense. Thus, density is a regular scale in both the narrow and 
wide senses. 

For regular scales, scale type can be defined analogously to the defini- 
tion in fundamental measurement, except that there are narrow and wide 
senses of scale type. For example, if g is a regular scale in the narrow 
(wide) sense, then we say it is a ratio scale in the narrow (wide) sense if the 
class of admissible transformations in the narrow (wide) sense is exactly 
the class of similarity transformations. Thus, density is an absolute scale in 
the narrow sense, since the only admissible transformation in the narrow 
sense is the identity. However, density is a ratio scale in the wide sense.* 

We shall adopt the same theory of meaningfulness as for fundamental 
measurement. Thus, for example, even in the wide sense, it makes sense to 
assert that one medium is twice as dense as another. We shall usually want 
statements to be meaningful in this wide sense. 

In Chapter 4, we shall apply the theory of uniqueness in derived 
measurement to psychophysical scaling. In the next section, we shall apply 
this theory to energy use, air pollution, and the consumer price index. 


Remark: Luce [1959, 1962] and Rozeboom [1962a, b] point out that 
certain parameters can enter into relationships $C(f_1, f_2, \ldots, f_n, g)$ in a 
manner different from that in either the narrow or wide senses as defined 
above. For example, sometimes $f_1, f_2, \ldots, f_n$ represent different measurements 
using the same scale. Then it may only be reasonable to use the 
same admissible transformation of each of the $f_i$ in determining the 
wide-scale type of g. We shall encounter an example of this in the next 
section. More complicated situations arise. For example, in the law of 
decay for a radioactive material, we have 


$$q = q_0 e^{-0.14t},$$


where q is the quantity of material at time t measured in seconds, and $q_0$ is 
the initial quantity. If time is measured in hours, the law changes to 


$$q = q_0 e^{-504t},$$


since $504 = (3600) \times (0.14)$. Thus, the law of radioactive decay could be 
restated as 


$$q = q_0 e^{-kt},$$


where k is a parameter. We have 

$$C(t, k, q) \text{ iff } q = q_0 e^{-kt}.$$


However, transformations of k are determined by transformations of t, 
and not by any measurement theory for k. Luce calls k a “dimensional 
parameter.” We may not take independent transformations of k as well as 


*Note that the proof of this required two steps: proof that every admissible transformation 
in the wide sense is a similarity, and proof that every similarity is an admissible transforma- 
tion in the wide sense. 
independent transformations of t. Thus, the discussion above must be 
modified. What is true is that transformations of t lead to transformations 
of k such that kt is a constant. This relationship is implicit in the statement 
of the condition C(t, k, q). Thus, in a wide sense, $q/q_0$ is an absolute scale. 
In applying the theory of uniqueness to derived measurement, the reader 
should be careful to modify the theory when needed to consider these sorts 
of complications. 
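
The role of the dimensional parameter can be sketched numerically. The example below assumes a hypothetical change of time unit from seconds to minutes and a hypothetical elapsed time; the only point is that k must be transformed together with t so that the product kt stays fixed.

```python
import math


def remaining_fraction(k, t):
    """q / q0 = exp(-k * t), with k and t expressed in matching time units."""
    return math.exp(-k * t)


k_per_second = 0.14       # decay parameter when t is in seconds
t_seconds = 30.0          # hypothetical elapsed time
seconds_per_minute = 60.0

# Changing the unit of t induces the corresponding change in k.
t_minutes = t_seconds / seconds_per_minute
k_per_minute = k_per_second * seconds_per_minute

in_seconds = remaining_fraction(k_per_second, t_seconds)
in_minutes = remaining_fraction(k_per_minute, t_minutes)

# Transforming t alone, while leaving k untouched, changes the prediction.
t_only = remaining_fraction(k_per_second, t_minutes)
print(in_seconds, in_minutes, t_only)
```

The first two values agree up to rounding, while the third does not, which is why k and t may not be transformed independently.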


Exercises 


1. The inefficiency I of a particular fossil fuel power plant might be 
measured as I = tons of emissions per ton of fuel burned. Show the 
following: 

(a) I is regular in both the narrow and wide senses. 

(b) I is an absolute scale in the narrow sense and a ratio scale in the 
wide sense. 

(c) Even in the wide sense, it makes sense to assert that plant a is 
twice as inefficient as plant b or that the inefficiency of plant a plus the 
inefficiency of plant b is greater than the inefficiency of plant c. 


2. Suppose f(a) is the height of a and g(a) the weight of a. Let 
$$h(a) = f(a) + g(a)$$


and 
$$k(a) = \sqrt{f(a)g(a)}.$$


Then h and k are derived scales. Show the following: 
(a) Both h and k are (regular) absolute scales in the narrow sense. 
(b) The scale k is a (regular) ratio scale in the wide sense, but h is not 
necessarily a (regular) ratio scale in the wide sense. 


3. Suppose u(a) is the utility of a, measured on an interval scale, and 
m(a) is the dollar value of a, measured on a ratio scale. Then the utility per 
dollar is measured as D(a) = u(a)/m(a). Show that there are admissible 
transformations of D(a) in the wide sense that are not positive linear 
transformations. 

4. The physical scales work, momentum, etc., are derived scales. Con- 
sider what types of scales they are. 

5. (a) Suppose $f_1$ and $f_2$ are both (regular) ratio scales. Show that 
$g = f_1 f_2$ is a (regular) ratio scale in the wide sense. 

(b) If $f_1$ and $f_2$ are both (regular) interval scales, is $g = f_1 f_2$ necessarily 
an interval scale in the wide sense? 


6. Repeat Exer. 5 for $g = f_1 + f_2$. 


2.6 Some Applications of the Theory of Meaningfulness: 
Energy Use, Air Pollution, and the Consumer Price Index 


2.6.1 Energy Use; Arithmetic and Geometric Means 


In this section, we give several more complicated applications of the 
theory of meaningfulness, to both fundamental and derived scales. We 
shall assume that all scales in question come from regular representations 
or are regular in both the narrow and wide senses. 

Let us first imagine that we study n animals under one kind of experimental 
treatment (say a special diet) and m animals under a second kind 
of treatment. We want to say that the average weight of the animals under 
the first kind of treatment is larger than the average weight of the animals 
under the second treatment. We treat weight as a fundamental scale. 
Specifically, if f is the scale of weight in question, we want to consider the 
statement 


$$\frac{1}{n} \sum_{i=1}^{n} f(a_i) > \frac{1}{m} \sum_{i=1}^{m} f(b_i). \tag{2.26}$$


Here, we calculate arithmetic means over two different sets, and compare 
them. We consider the statement (2.26) meaningful if for all admissible 
transformations $\phi$, (2.26) holds if and only if 


$$\frac{1}{n} \sum_{i=1}^{n} (\phi \circ f)(a_i) > \frac{1}{m} \sum_{i=1}^{m} (\phi \circ f)(b_i). \tag{2.27}$$

If $\phi$ is a similarity transformation, say $\phi(x) = \alpha x$, $\alpha > 0$, then certainly 
(2.27) holds if and only if (2.26) holds. This is even the case when $\phi$ is a 
positive linear transformation, that is, $\phi(x) = \alpha x + \beta$, $\alpha > 0$. For then 
(2.27) becomes 


$$\frac{1}{n} \sum_{i=1}^{n} [\alpha f(a_i) + \beta] > \frac{1}{m} \sum_{i=1}^{m} [\alpha f(b_i) + \beta],$$


which reduces to (2.26). Thus, (2.26) is meaningful if f is a ratio scale or an 
interval scale. If f is an ordinal scale, then (2.26) is meaningless. Proof is 
left to the reader. This result can be applied to say that the statement 
“group A has higher average IQ than group B” is meaningless if IQ is only 
an ordinal scale. (Raw scores on intelligence tests define ordinal scales, 
according to Stevens [1959].) However, the statement is meaningful if IQ is 
an interval scale or a ratio scale. (It has been argued that “standard scores” 
on an intelligence test define an interval scale; see Stevens [1959].) All too 
often in the social sciences, comparisons of arithmetic means (as well as 
other comparisons) are made with little attention paid to whether or not 
these comparisons are meaningful. 
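
One possible numerical illustration of these claims is sketched below. The two groups of scores and both transformations are hypothetical; the monotone transformation is defined only on the scores that actually occur.

```python
from fractions import Fraction


def mean(values):
    """Exact arithmetic mean of a list of integers."""
    return Fraction(sum(values), len(values))


# Hypothetical scores for two groups.
group_1 = [1, 1, 10]
group_2 = [3, 3, 3]


def first_mean_is_larger(phi):
    # Statement (2.26), evaluated after applying phi to every score.
    return mean([phi(x) for x in group_1]) > mean([phi(x) for x in group_2])


original = first_mean_is_larger(lambda x: x)

# Positive linear transformation x -> 2x + 5 (admissible for an interval scale).
after_linear = first_mean_is_larger(lambda x: 2 * x + 5)

# Strictly increasing transformation on the observed scores
# (admissible for an ordinal scale).
squash = {1: 1, 3: 4, 10: 5}
after_monotone = first_mean_is_larger(lambda x: squash[x])
print(original, after_linear, after_monotone)
```

The comparison of means survives the positive linear transformation but is reversed by the order-preserving one, which is the sense in which (2.26) is meaningless for ordinal scales.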

Suppose next that we have several experts or individuals and each rates 
two alternatives, a and b. Let $f_i(a)$ be the rating of a by expert i, and $f_i(b)$ 
be the rating of b by expert i. We might want to consider the statement 


$$\frac{1}{n} \sum_{i=1}^{n} f_i(a) > \frac{1}{n} \sum_{i=1}^{n} f_i(b), \tag{2.28}$$


where n is the number of experts. Here, we say that the average rating of 
alternative a is higher than the 
average rating of alternative b. If we think of each $f_i$ as a fundamental 
scale, then the statement (2.28) may be thought of as a statement involving 
several fundamental scales or a statement involving one derived scale, the 
average $\frac{1}{n} \sum_i f_i$. We use the former interpretation, though the latter interpretation 
gives the same results (using the wide sense of scale). Even if each $f_i$ 
is a ratio scale, statement (2.28) is now meaningless. For, we must consider 
simultaneously transformations $\phi_i(x) = \alpha_i x$ of each $f_i$, and certainly there 
are $\alpha_i$ such that (2.28) holds, while 


$$\frac{1}{n} \sum_{i=1}^{n} \alpha_i f_i(a) > \frac{1}{n} \sum_{i=1}^{n} \alpha_i f_i(b) \tag{2.29}$$


does not. On the other hand, comparison of geometric means* over 
individuals is meaningful, for 


$$\sqrt[n]{\prod_i f_i(a)} > \sqrt[n]{\prod_i f_i(b)} \iff \sqrt[n]{\prod_i \alpha_i f_i(a)} > \sqrt[n]{\prod_i \alpha_i f_i(b)}.$$


Another way to phrase this conclusion is that if $g = \sqrt[n]{\prod_i f_i}$ is thought of 
as a derived scale, then it forms a ratio scale in the wide sense, and so 
g(a) > g(b) is a meaningful statement (in the wide sense). 
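
The difference between the two kinds of mean can be seen on a small hypothetical example with two experts. In the sketch below the ratings and the sampled rescaling factors are invented for illustration; each pair of factors rescales the two experts' ratio scales independently.

```python
import math

# ratings[x][i] is the rating of alternative x by expert i (hypothetical values).
ratings = {"a": [1, 9], "b": [4, 3]}


def arithmetic_mean(values):
    return sum(values) / len(values)


def geometric_mean(values):
    return math.prod(values) ** (1 / len(values))


def a_beats_b(mean_fn, alphas):
    """Truth value of mean(alpha_i f_i(a)) > mean(alpha_i f_i(b))."""
    a = [alpha * value for alpha, value in zip(alphas, ratings["a"])]
    b = [alpha * value for alpha, value in zip(alphas, ratings["b"])]
    return mean_fn(a) > mean_fn(b)


# Independent similarity transformations, one factor per expert.
rescalings = [(1, 1), (10, 1), (1, 10), (3, 7)]

arithmetic_results = [a_beats_b(arithmetic_mean, alphas) for alphas in rescalings]
geometric_results = [a_beats_b(geometric_mean, alphas) for alphas in rescalings]
print(arithmetic_results, geometric_results)
```

The arithmetic comparison changes its truth value from one rescaling to another, whereas the geometric comparison gives the same answer throughout, since every rescaling multiplies both geometric means by the same positive factor.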

We shall briefly mention an application of this last example. As part of a 
larger study (Roberts [1972, 1973]), a set of variables relevant to the 
growing demand for energy was presented to a panel of experts, who were 
asked to judge their relative importance using the method of magnitude 
estimation. In this method, the expert first selects that variable which 
seems most important and assigns it the rating 100. Then he rates the other 
variables in terms of the most important one, so that a variable receiving a 
rating of 50 is considered “half as important” as one receiving a rating of 
100, etc. A typical set of variables and ratings of one of the experts is 
shown in Table 2.3. It seems plausible that the magnitude estimation 


*The geometric mean of a collection of numbers $x_1, x_2, \ldots, x_n$ is $\sqrt[n]{\prod_i x_i}$, where $\prod_i x_i$ 
means the product $x_1 \cdot x_2 \cdots x_n$. 
Table 2.3. Magnitude Estimation by One Expert of Relative Importance for Energy 
Demand of Variables Related to Commuter Bus Transportation in a Given Region* 


    Variable                                        Relative Importance Rating 

 1. Number of passenger miles (annually, by bus)                 80 
 2. Number of trips (annually)                                  100 
 3. Number of miles of bus routes                                50 
 4. Number of miles of special bus lanes                         50 
 5. Average time home to office (or office to home, or sum)      70 
 6. Average distance home to office                              65 
 7. Average speed                                                10 
 8. Average number passengers per bus                            20 
 9. Distance to bus stop from home (or office, or sum)           50 
10. Number of buses in the region                                20 
11. Number of stops (home to office, or vice versa, or sum)      20 


*Adapted from Roberts [1972] with permission of the RAND Corporation. 


procedure leads to a ratio scale; this is presumed by Stevens [1957, 1968].* 
Thus, comparisons of geometric mean relative importance ratings (over 
experts) are probably meaningful, while comparisons of arithmetic means 
are probably not. This observation led to the use of geometric means. (See 
Roberts [1979] for a related application of the theory of meaningfulness.) 


Remark: The analysis of this example must be done with some care. 
Since the scale value of the most important element was fixed to be the 
same for all the experts, the scales are not really independent, and so it is 
probably not reasonable to demand that the truth value of a statement be 
preserved under different transformations of all the scales, but rather only 
under the same transformation of each scale. But now it probably becomes 
meaningful to speak of comparisons of arithmetic means. For if we take 
a; = a, all i, then (2.28) holds if and only if (2.29) holds. This reasoning 
points up the fact that the theory of meaningfulness has to be applied with 
care. 


Remark: A more detailed discussion of measurement theory and the use 
of statistical summaries such as means and of statistical tests can be found 
in Pfanzagl [1968]. For a discussion of many particular statistics, see 
Adams, Fagot, and Robinson [1965]. There is a need for a more systematic 
development, and such a development would be of widespread practical 
significance, if its results were made widely known. 

There have been views expressed in the measurement literature which 
differ from the point of view taken here, and assert that the choice of 


*We have remarked earlier that where there is no obvious representation, admissible 
transformations must be defined as functions preserving empirical information depicted by a 
scale, and hence scale type is not formally defined, but depends on our interpretation of what 
the empirical information content of the scale is. 
statistic (arithmetic mean, geometric mean, etc.) or of statistical test (t-test, 
etc.) does not depend on the type of scale involved. For example, Ander- 
son [1961] argues that statistical tests concern numbers, and it does not 
matter what kind of scale these numbers come from. For a survey of such 
arguments, and a criticism of them, see Adams, Fagot, and Robinson 
[1965] or Stevens [1968]. Hays [1973, Section 3.2] also discusses this issue. 
Luce [1967] gives references to some of the literature on this issue, and 
argues that the use of a statistical test is limited by the class of transformations
under which a null hypothesis is unchanged. Adams, Fagot, and
Robinson [1965] argue that applying any sort of statistic to numbers is 
always appropriate. However, statements made using this statistic might be 
inappropriate if they are meaningless. Finally, some authors have for- 
malized the risk involved in applying a statistical test (for example, 
comparison of arithmetic means) in a situation (for example, an ordinal 
scale) when such a test is inappropriate. See Abelson and Tukey [1959, 
1963] for an example of such an attempt. 


2.6.2 Consumer Price Index 


Turning to another example, let us consider the computation of the 
consumer price index.* This index relates current prices of certain basic 
commodities, including food, clothing, fuel, etc., to prices at some refer- 
ence time. Suppose the index will be based on n fixed commodities. 
Suppose $p_i(0)$ is the price of commodity $i$ at the reference or base time, and
$p_i(t)$ is the price of commodity $i$ at time $t$. Then one consumer price index,
due to Bradstreet and Dutot (see Fisher [1923, p. 40]), is given by


$$I(p_1(t), p_2(t), \ldots, p_n(t)) = \sum_{i=1}^{n} p_i(t) \Big/ \sum_{i=1}^{n} p_i(0). \tag{2.30}$$


This is an example of a derived scale: $I$ is defined in terms of
$p_1, p_2, \ldots, p_n$. Now each price is measured on a ratio scale: the admissible
transformations (in the wide sense) are conversions from dollars to
cents, cents to francs, etc. But the scales are independent, so an admissible
transformation of $I$ in the wide sense results from independent choice of
positive numbers $\alpha_1, \alpha_2, \ldots, \alpha_n$. But now even the statement


$$I(p_1(t), p_2(t), \ldots, p_n(t)) > I(p_1(s), p_2(s), \ldots, p_n(s)) \tag{2.31}$$


is meaningless in the wide sense. For it is quite possible to choose the $\alpha_i$ so
that


$$\sum_{i=1}^{n} p_i(t) \Big/ \sum_{i=1}^{n} p_i(0) > \sum_{i=1}^{n} p_i(s) \Big/ \sum_{i=1}^{n} p_i(0), \tag{2.32}$$


*Our discussion in part follows Pfanzagl [1968, p. 49] who references Fisher [1923, p. 40].
The discussion easily generalizes to other indices, such as of productivity and consumer 
confidence. See Eichhorn [1978] for some recent discussion, and a variety of references, 
especially on page 158. See also Allen [1975] and Samuelson and Swamy [1974]. 
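while

$$\sum_{i=1}^{n} \alpha_i p_i(t) \Big/ \sum_{i=1}^{n} \alpha_i p_i(0) < \sum_{i=1}^{n} \alpha_i p_i(s) \Big/ \sum_{i=1}^{n} \alpha_i p_i(0). \tag{2.33}$$


For example, suppose $n = 2$, $p_1(0) = p_2(0) = 1$, $p_1(t) = 5$, $p_2(t) = 10$,
$p_1(s) = 6$, $p_2(s) = 8$, $\alpha_1 = 2$, $\alpha_2 = \tfrac{1}{2}$. Then (2.32) and (2.33) hold, since


$$\frac{5 + 10}{1 + 1} > \frac{6 + 8}{1 + 1}$$

and

$$\frac{10 + 5}{2 + \tfrac{1}{2}} < \frac{12 + 4}{2 + \tfrac{1}{2}}.$$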
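
The arithmetic of this example can be replayed in a few lines. The sketch below is only an illustration: the function name `index_I` and its optional `alphas` argument are assumptions introduced here, and the rescaling factors are taken to be 2 and 1/2, the values that reproduce the displayed sums 10 + 5 and 12 + 4. Exact fractions are used so that the comparisons involve no rounding.

```python
from fractions import Fraction

def index_I(prices_t, prices_0, alphas=None):
    """Ratio of summed current prices to summed base prices, as in Eq. (2.30).

    alphas[i] is an optional change of unit applied to commodity i.
    """
    if alphas is None:
        alphas = [1] * len(prices_t)
    num = sum(a * p for a, p in zip(alphas, prices_t))
    den = sum(a * p for a, p in zip(alphas, prices_0))
    return Fraction(num) / Fraction(den)

p_0 = [1, 1]    # base prices
p_t = [5, 10]   # prices at time t
p_s = [6, 8]    # prices at time s
alphas = [2, Fraction(1, 2)]  # independent changes of unit

before = index_I(p_t, p_0) > index_I(p_s, p_0)
after = index_I(p_t, p_0, alphas) > index_I(p_s, p_0, alphas)

print(index_I(p_t, p_0), index_I(p_s, p_0), before)
print(index_I(p_t, p_0, alphas), index_I(p_s, p_0, alphas), after)
```

The truth value of the comparison changes from true to false, which is exactly what it means for statement (2.31) to be meaningless when units may be chosen independently.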


If we insist that all prices be measured in the same units, which is
reasonable, then various comparisons of the index $I$ are meaningful, even
in the wide sense. For an admissible transformation of $I$ in the wide sense
results from multiplication of each $p_i(t)$ and $p_i(0)$ by the same positive
number $\alpha$. Then


$$\sum_{i=1}^{n} \alpha p_i(t) \Big/ \sum_{i=1}^{n} \alpha p_i(0) = \sum_{i=1}^{n} p_i(t) \Big/ \sum_{i=1}^{n} p_i(0),$$

so it is meaningful to assert that


$$I(p_1(t), p_2(t), \ldots, p_n(t)) > I(p_1(s), p_2(s), \ldots, p_n(s)).$$


It is now even meaningful to assert that the index has doubled or increased
by 20% between time $s$ and time $t$, for the following statements are now
easily seen to be meaningful:


$$I(p_1(t), p_2(t), \ldots, p_n(t)) = 2\,I(p_1(s), p_2(s), \ldots, p_n(s)),$$
$$I(p_1(t), p_2(t), \ldots, p_n(t)) = 1.2\,I(p_1(s), p_2(s), \ldots, p_n(s)).$$


A consumer price index for which comparisons of size are meaningful
even if different prices may be measured in different units is


$$J(p_1(t), p_2(t), \ldots, p_n(t)) = \left( \prod_{i=1}^{n} p_i(t) \Big/ \prod_{i=1}^{n} p_i(0) \right)^{1/n}. \tag{2.34}$$


*Index J is discussed in Boot and Cox [1970, p. 499]. 




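For

$$\left( \prod_{i=1}^{n} \alpha_i p_i(t) \Big/ \prod_{i=1}^{n} \alpha_i p_i(0) \right)^{1/n} = \left( \prod_{i=1}^{n} p_i(t) \Big/ \prod_{i=1}^{n} p_i(0) \right)^{1/n}.$$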


Indeed, with the index $J$ of Eq. (2.34), it again is meaningful to say that the
index doubled between time $s$ and time $t$ or that it increased by 20%. For
the statements


$$J(p_1(t), p_2(t), \ldots, p_n(t)) = 2\,J(p_1(s), p_2(s), \ldots, p_n(s))$$
and
$$J(p_1(t), p_2(t), \ldots, p_n(t)) = 1.2\,J(p_1(s), p_2(s), \ldots, p_n(s))$$


are meaningful in the wide sense, even allowing independent changes of 
scale for different prices. The reader should note that simply to say that 
these comparisons or statements are meaningful in our technical sense 
does not say they are meaningful in an economic sense. It is not clear 
exactly what is the economic content of a consumer price index as 
measured by index J. Meaningfulness in our technical sense is necessary 
but not sufficient for meaningfulness in practical terms. 
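
The invariance of the geometric index can also be checked numerically. The sketch below is a hedged illustration, not part of the original discussion: the function name `index_J` is an assumption, and the prices and rescaling factors are simply reused from the two-commodity example given earlier for index I.

```python
import math

def index_J(prices_t, prices_0):
    """Geometric price index of Eq. (2.34)."""
    n = len(prices_t)
    return (math.prod(prices_t) / math.prod(prices_0)) ** (1.0 / n)

def rescale(prices, alphas):
    """Change the unit of commodity i by the positive factor alphas[i]."""
    return [a * p for a, p in zip(alphas, prices)]

p_0, p_t, p_s = [1, 1], [5, 10], [6, 8]
alphas = [2, 0.5]  # independent changes of unit

j_t = index_J(p_t, p_0)
j_s = index_J(p_s, p_0)
j_t_scaled = index_J(rescale(p_t, alphas), rescale(p_0, alphas))
j_s_scaled = index_J(rescale(p_s, alphas), rescale(p_0, alphas))

print(j_t, j_t_scaled)
print(j_t / j_s, j_t_scaled / j_s_scaled)
```

Both the value of the index and the ratio between two dates are unchanged, so statements such as "the index doubled" keep their truth value under independent changes of unit.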

Some economic arguments can be raised against the index $I$ of Eq. (2.30)
as well. In actual practice, the consumer price index is measured by the
Bureau of Labor Statistics by taking a ratio of weighted sums,


$$K(p_1(t), p_2(t), \ldots, p_n(t)) = \sum_{i=1}^{n} \lambda_i p_i(t) \Big/ \sum_{i=1}^{n} \lambda_i p_i(0). \tag{2.35}$$


(See Samuelson [1961, p. 135], Lapin [1973], Boot and Cox [1970], Rothwell
[1964], or Mudgett [1951].) The weight $\lambda_i$ measures the quantity of item $i$
in an “average market basket,” a weighting factor disregarded in index $I$.
The weighting factors are obtained from surveys of spending patterns.
These weighting factors differ over time, of course. (See Table 2.4 for an
example of different spending patterns.) But one set of weighting factors is
fixed over all time in calculating the index $K$. (See Exers. 5 and 6.)

In terms of the consumer price index $K$, if weights $\lambda_i$ are assumed fixed,
the following statements are meaningless in the wide sense if the prices are
allowed to be measured in different units, but meaningful otherwise:


$$K(p_1(t), p_2(t), \ldots, p_n(t)) > K(p_1(s), p_2(s), \ldots, p_n(s)),$$
$$K(p_1(t), p_2(t), \ldots, p_n(t)) = 2\,K(p_1(s), p_2(s), \ldots, p_n(s)),$$
$$K(p_1(t), p_2(t), \ldots, p_n(t)) = 1.2\,K(p_1(s), p_2(s), \ldots, p_n(s)).$$
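
A small numerical check makes the two cases concrete. In the sketch below, the function `index_K`, the weights `lam`, and the rescaling factors are all hypothetical choices made for illustration; only the form of the index comes from Eq. (2.35). The prices are reused from the earlier two-commodity example.

```python
from fractions import Fraction

def index_K(prices_t, prices_0, weights):
    """Weighted price index of Eq. (2.35) with fixed weights."""
    num = sum(w * p for w, p in zip(weights, prices_t))
    den = sum(w * p for w, p in zip(weights, prices_0))
    return Fraction(num, den)

def rescale(prices, alphas):
    """Change the unit of commodity i by the positive factor alphas[i]."""
    return [a * p for a, p in zip(alphas, prices)]

p_0, p_t, p_s = [1, 1], [5, 10], [6, 8]
lam = [3, 1]             # hypothetical fixed market-basket weights

same_unit = [100, 100]   # e.g. dollars to cents for every price
mixed_units = [1, 2]     # a different factor for each price

def t_exceeds_s(alphas):
    k_t = index_K(rescale(p_t, alphas), rescale(p_0, alphas), lam)
    k_s = index_K(rescale(p_s, alphas), rescale(p_0, alphas), lam)
    return k_t > k_s

original = index_K(p_t, p_0, lam) > index_K(p_s, p_0, lam)
print(original, t_exceeds_s(same_unit), t_exceeds_s(mixed_units))
```

With a common change of unit the comparison keeps its truth value; with independent changes it reverses.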


We return to the consumer price index, and index numbers in general, in 
the exercises below and in Exer. 15, Sec. 4.2. 
Table 2.4. Percentage of Consumer Spending for Various Years of Urban
Wage Earners and Clerical Worker Families, by Broad Categories*


                            1917-1919   1934-1936   1952   1963
Food                            40          33       30     22
Housing                         27          32       33     33
Apparel                         18          11        9     11
Transportation                   3           8       11     14
Other goods and services        12          16       17     20


*Source: Boot and Cox [1970, p. 493].


2.6.3 Measurement of Air Pollution 


For our last application, we turn to measurement of air pollution. There 
are various pollutants present in the air, which have quite different effects. 
As far as the damaging health effects of pollution are concerned, the most 
important pollutants seem to be carbon monoxide (CO), hydrocarbons 
(HC), nitrogen oxides (NOX), sulfur oxides (SOX), and particulate matter 
(dust, etc.) (PM). Also damaging are the products of chemical reactions 
among these pollutants, the most serious ones being the oxidants (such as 
ozone) produced by hydrocarbons and nitrogen oxides reacting in the 
presence of sunlight. Finally, some of these pollutants are more serious in 
the presence of others; for example, sulfur oxides are more harmful in the 
presence of particulate matter (National Air Pollution Control 
Administration [1969]). We say that these are synergistic effects. We shall 
disregard chemical reactions among pollutants and synergistic effects. 

To be able to compare alternative pollution control policies, one should 
be able to compare the effects of different pollutants. Indeed, some policies 
might result in net increases in the emissions of some (it is hoped less 
harmful) pollutants while achieving cutbacks in emissions of other pollu- 
tants. How can two such strategies be compared? One proposal given in 
the literature is that some form of combined pollution index be used, 
which would give one number indicating how bad the air pollution is, 
based on the levels of emissions of the different pollutants. There are other 
advantages for having such a single index. For one, daily or yearly 
pollution forecasts and reports could be more easily given, and progress 
could be easily measured. 

A simple way of producing a single, combined pollution index is to 
measure the total weight of emissions of each pollutant i over a fixed 
period of time (one hour, one day, one year, or whatever the time period in 
question is), and then to sum up these numbers. Let e(i, t, k) be the total 
weight of emissions of pollutant i (per cubic meter) over the tth time
period and due to the kth source or measured in the kth location. Then the
simple pollution index we have just described is given by


$$A(t, k) = \sum_{i} e(i, t, k). \tag{2.36}$$


This is again an example of derived measurement. Use of the derived index 
A(t, k) leads to the conclusion that transportation is the largest source of 
air pollution, accounting for over 50% of all pollution, and that stationary 
fuel combustion (especially by electric power plants) is second largest; use 
of the numbers e(i, t, k) leads to the conclusion that carbon monoxide
accounts for over half of all emitted air pollution (Walther, 1972). 

These conclusions are meaningful (in the wide sense). For statements 
like the following are meaningful if we measure all e(i, t, k) in the same
units of mass (a unit often used in the air pollution literature is μg/m³):


$$A(t, k) > A(t, k'),$$
$$A(t, k_r) > \sum_{k \neq k_r} A(t, k),$$
$$\sum_{t,k} e(i, t, k) > \sum_{t,k} \sum_{j \neq i} e(j, t, k).$$


These statements are meaningful because an admissible transformation in 
the wide sense amounts to multiplying each e(i, t, k) by the same positive
number $\alpha$, and the comparison


$$\sum_{i} e(i, t, k) > \sum_{i} e(i, t, k')$$

holds if and only if

$$\sum_{i} \alpha e(i, t, k) > \sum_{i} \alpha e(i, t, k'),$$

the comparison

$$\sum_{i} e(i, t, k_r) > \sum_{k \neq k_r} \sum_{i} e(i, t, k)$$

holds if and only if

$$\sum_{i} \alpha e(i, t, k_r) > \sum_{k \neq k_r} \sum_{i} \alpha e(i, t, k),$$

and the comparison

$$\sum_{t,k} e(i, t, k) > \sum_{t,k} \sum_{j \neq i} e(j, t, k)$$

holds if and only if

$$\sum_{t,k} \alpha e(i, t, k) > \sum_{t,k} \sum_{j \neq i} \alpha e(j, t, k).$$
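
These equivalences are easy to confirm on a toy data set. The emission figures below are invented for illustration (they are not taken from Walther or any other source cited here), as are the helper names; only the definition of A(t, k) follows Eq. (2.36). A single time period is used, so the index t is suppressed.

```python
# Hypothetical emissions e(i, t, k) for one time period, all in the same mass unit.
# Outer keys are sources k, inner keys are pollutants i.
emissions = {
    "transportation": {"CO": 600, "HC": 90, "NOX": 60},
    "stationary":     {"CO": 20,  "HC": 10, "NOX": 70},
    "industry":       {"CO": 80,  "HC": 40, "NOX": 30},
}

def index_A(e, k):
    """Simple pollution index A(t, k) of Eq. (2.36): total mass from source k."""
    return sum(e[k].values())

def rescale(e, alpha):
    """Apply the same change of mass unit to every e(i, t, k)."""
    return {k: {i: alpha * x for i, x in row.items()} for k, row in e.items()}

def transportation_dominates(e):
    """A(t, k_r) exceeds the sum of A(t, k) over all other sources k."""
    others = sum(index_A(e, k) for k in e if k != "transportation")
    return index_A(e, "transportation") > others

def co_dominates(e):
    """Total CO mass exceeds the total mass of all other pollutants."""
    co = sum(row["CO"] for row in e.values())
    rest = sum(x for row in e.values() for i, x in row.items() if i != "CO")
    return co > rest

before = (transportation_dominates(emissions), co_dominates(emissions))
after = (transportation_dominates(rescale(emissions, 1000)),
         co_dominates(rescale(emissions, 1000)))
print(before, after)
```

Because every term on both sides is multiplied by the same positive factor, the truth values before and after the change of unit agree.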
Although such comparisons using the numbers e(i, t, k) and A(t, k) are
meaningful in the technical sense, there is some question about whether 
they are meaningful comparisons of the pollution level in a practical sense. 
A unit of mass of carbon monoxide is far less harmful than a unit of mass 
of nitrogen oxide. For example, 1971 U.S. Environmental Protection 
Agency (EPA) ambient air quality standards, based on health effects and 
corrected for a 24-hour period, allowed 7800 units of carbon monoxide as 
compared to 330 units of nitrogen oxides, 788 of hydrocarbons, 266 of 
sulfur oxides, and 150 of particulate matter. (For these EPA standards, see 
Environmental Protection Agency, 1971; for corrections to 24 hours, see 
Babcock and Nagda, 1973). Babcock and Nagda call these numbers 
tolerance factors. They are also called Minimum Acute Toxicity Effluent 
(MATE) criteria. (See, for example, Hangebrauck [1977], Industrial En- 
vironmental Research Laboratory [1976] or Schalit and Wolfe [1978].) The 
tolerance factors are levels above which adverse effects are known or 
thought to occur. Let $\tau(i)$ be the tolerance factor for the $i$th pollutant. The
severity factor or effect factor is $1/\tau(i)$ or $\tau(\mathrm{CO})/\tau(i)$, the ratio of tolerance
factor for CO to that for $i$.* Babcock and Nagda and others (Babcock
[1970], Walther [1972], Caretto and Sawyer [1972]) suggest weighting the
emission levels (in mass) by the severity factor and obtaining a combined
pollution index by using a weighted sum. This amounts to using the indices


$$\frac{1}{\tau(i)}\, e(i, t, k) \tag{2.37}$$
and
$$B(t, k) = \sum_{i} \frac{1}{\tau(i)}\, e(i, t, k) \tag{2.38}$$
or
$$\frac{\tau(\mathrm{CO})}{\tau(i)}\, e(i, t, k)$$
and
$$B'(t, k) = \sum_{i} \frac{\tau(\mathrm{CO})}{\tau(i)}\, e(i, t, k). \tag{2.39}$$


*The severity factors appearing in the literature differ from reference to reference, for 
various reasons. First, federal air quality standards are not all laid out for the same time 
period. Rather, some are for one hour, some for eight hours, etc. There is a difference of 
opinion about how to extrapolate these standards to the same time period, for example 24 
hours. There are also differing approaches to bringing in chemical reactions and synergistic 
effects. 
The index (2.37) is sometimes called degree of hazard (see, for example,
Industrial Environmental Research Laboratory [1976] and Schalit and 
Wolfe [1978]). The combined pollution index (2.38) (called pindex by 
Babcock [1970]) is designed to measure the harmful effect of pollution. 
Under this index, transportation still is the largest source of pollutants by 
effect, but now accounting for less than 50%. Stationary sources fall to 
fourth place on the list. Use of the weighted factors $[1/\tau(i)]e(i, t, k)$ drops
carbon monoxide to the bottom of the list of pollutants by effect, with
$[1/\tau(\mathrm{CO})]e(\mathrm{CO}, t, k)$ just over 2% of the total; it was over 50% of the
total by mass. (See Walther [1972] for a discussion of these results, and
Babcock and Nagda [1973] for a discussion of implications of the results 
for air pollution control strategies.) A similar analysis could be applied to a 
particular factory or power plant which puts out a variety of different 
pollutants in its effluent stream. For the procedure see, for example, 
Hangebrauck [1977], Industrial Environmental Research Laboratory [1976] 
or Schalit and Wolfe [1978]. 

These results are meaningful in our technical sense. For, again assuming 
that all emission weights are measured in the same units, an admissible 
transformation in the wide sense now amounts to multiplication of each 
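e(i, t, k) and each $\tau(i)$ by the same positive number $\alpha$. Since


$$\frac{1}{\alpha \tau(i)}\, \alpha e(i, t, k) = \frac{1}{\tau(i)}\, e(i, t, k),$$

the statements


$$B(t, k) > B(t, k'),$$
$$B(t, k_r) > \sum_{k \neq k_r} B(t, k),$$

and

$$\sum_{t,k} \frac{1}{\tau(i)}\, e(i, t, k) = (.02) \sum_{t,k,j} \frac{1}{\tau(j)}\, e(j, t, k)$$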


are meaningful. Similar statements are also meaningful if we use
$[\tau(\mathrm{CO})/\tau(i)]e(i, t, k)$ and $B'(t, k)$ of Eq. (2.39).
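
The effect of the weighting can be seen in a short computation. In the sketch below, the tolerance factors are the five numbers quoted above from the 1971 EPA standards; the emission figures `e`, the function names, and the conversion factor `alpha` are hypothetical and chosen only so that the arithmetic is easy to follow.

```python
from fractions import Fraction

# Tolerance factors quoted in the text (1971 EPA standards, 24-hour basis).
tau = {"CO": 7800, "NOX": 330, "HC": 788, "SOX": 266, "PM": 150}

# Hypothetical emissions e(i, t, k) for a single source and period, same mass unit.
e = {"CO": 3900, "NOX": 330, "HC": 394, "SOX": 133, "PM": 75}

def index_B(e, tau):
    """Combined pollution index B(t, k) of Eq. (2.38)."""
    return sum(Fraction(e[i], tau[i]) for i in e)

def co_share_by_mass(e):
    return Fraction(e["CO"], sum(e.values()))

def co_share_by_effect(e, tau):
    return Fraction(e["CO"], tau["CO"]) / index_B(e, tau)

# The same change of mass unit applied to emissions and tolerance factors alike.
alpha = 1000
e_scaled = {i: alpha * x for i, x in e.items()}
tau_scaled = {i: alpha * x for i, x in tau.items()}

print(index_B(e, tau), index_B(e_scaled, tau_scaled))
print(float(co_share_by_mass(e)), float(co_share_by_effect(e, tau)))
```

In this made-up case carbon monoxide is most of the emitted mass but only one sixth of the weighted total, and the index itself is unchanged by the change of unit.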

The measure B(t, k) of Eq. (2.38) amounts to the following. For a given
pollutant, take the percentage of a given harmful level of emissions that is 
reached in a given period of time, and add up these percentages over all 
pollutants. The resulting number is a measure of total air pollution. (It can, 
of course, come out larger than 100%.) This is the procedure that was 
introduced for use in the San Francisco Bay Area in the 1960’s (see Bay 
Area Pollution Control District [1968] or Sauter and Chilton [1970]). There 
are some serious problems with this measure. First, if 100% of the carbon 
monoxide tolerance level is attained, this is known to have some damaging
effects. The measure implies that the effects are equally severe if the levels 
of all five major pollutants are relatively low, say 20% of their known 
harmful levels. Similarly, the measure assumes that reaching 95% of the 
tolerance level for carbon monoxide and 5% of the level for one other 
pollutant is as bad as reaching 100% of the tolerance level for carbon 
monoxide. Is this really the case? It seems unlikely. Thus, once again, the 
comparisons made using the combined pollution index are meaningful in 
our technical sense, but there is some doubt as to whether or not they have 
real meaning. 
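
The objection can be stated as a three-line calculation. The sketch below expresses each pollutant's level as a percentage of its tolerance level, as described above; the function name and the three scenarios are illustrative assumptions that mirror the cases mentioned in the text.

```python
def index_B_percent(percent_of_tolerance):
    """B(t, k) written as a sum of percentages of each pollutant's tolerance level."""
    return sum(percent_of_tolerance.values())

# Three hypothetical situations, in percent of tolerance level.
only_co = {"CO": 100, "NOX": 0, "HC": 0, "SOX": 0, "PM": 0}
all_low = {"CO": 20, "NOX": 20, "HC": 20, "SOX": 20, "PM": 20}
co_plus_one = {"CO": 95, "NOX": 5, "HC": 0, "SOX": 0, "PM": 0}

scores = [index_B_percent(s) for s in (only_co, all_low, co_plus_one)]
print(scores)
```

All three situations receive the same score, although only the first is known to reach a harmful level of any single pollutant.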


Exercises 


1. If geometric means are used in place of arithmetic means in Eq. 
(2.26), and f is weight, is the comparison still meaningful? 


2. (a) In the situation where there are different experts, and each $f_i$ is
an ordinal scale, show that comparison of arithmetic means is meaningless.

(b) What about comparison of geometric means?

(c) Show that comparison of medians is meaningful if there is an
odd number of experts and we only allow the same admissible transformation
of each $f_i$.

(d) What if we allow different transformations? 


3. In Section 8.3, we shall consider direct estimates of subjective 
probability by various experts. If $p_i(A)$ is the estimate by expert $i$ of the
subjective probability of event $A$, consider the meaningfulness of the
statement


$$\frac{1}{n} \sum_{i=1}^{n} p_i(A) > \frac{1}{n} \sum_{i=1}^{n} p_i(B).$$


(It is not clear what kind of scale direct estimates of subjective probability 
define. In Section 8.3, we shall argue that they might be absolute scales, or 
they might even be ratio scales, with each expert choosing a unit indepen- 
dently.) 


4. If each $p_i(t)$ is measured on the same interval scale, and if $K$ is
defined by Eq. (2.35), are either of the following statements meaningful?
(a) $K(p_1(t), p_2(t), \ldots, p_n(t)) > K(p_1(s), p_2(s), \ldots, p_n(s))$.
(b) $K(p_1(t), p_2(t), \ldots, p_n(t)) = 2\,K(p_1(s), p_2(s), \ldots, p_n(s))$.

5. The Laspeyres price index (Laspeyres [1871]) is the consumer price
index $K_L$ obtained from $K$ of Eq. (2.35) by setting the weight $\lambda_i$ equal to
the quantity $q_i(0)$ of good $i$ in the “average market basket” in year 0, and
multiplying the value of $K$ by 100. The Paasche price index (Paasche
[1874]) is the consumer price index $K_P$ obtained from $K$ by setting the
weight $\lambda_i$ equal to the quantity $q_i(t)$ of good $i$ in the “average market
basket” in year $t$ and multiplying the value of $K$ by 100. (The Laspeyres
index is the one usually used.)* The first price index was computed in 1764
by Carli, to compare Italian prices in 1750 with prices in 1500. Carli
limited consideration to three items: oil, grain, and wine. Let us suppose
for the sake of discussion that Carli had obtained the basic price and
quantity data of Table 2.5.


Table 2.5. Hypothetical Prices and Quantities of Oil, Grain,
and Wine in Italy in 1500 and 1750


Item i            p_i(0)*   p_i(t)*   q_i(0)   q_i(t)
Oil (qt)           $.10      $.15       10       12
Grain (bushel)     $.30      $.25       10       16
Wine (gal)         $.10      $.20       10       10


*Year 0 = 1500; year t = 1750.

(a) Calculate the Laspeyres price index for the year 1750 relative to 
the year 1500. 

(b) Calculate the Paasche index for the year 1750 relative to the year 
1500. 

(c) Calculate both indices for the year 1500 relative to the year 
1500. 

6. If all prices are measured in the same units, and quantities $q_i(0)$ and
$q_i(t)$ are assumed fixed, consider whether it is meaningful (in the wide
sense) to compare the Laspeyres and the Paasche price indices, that is,
whether the statement


$$K_L(p_1(t), p_2(t), \ldots, p_n(t)) > K_P(p_1(t), p_2(t), \ldots, p_n(t))$$


is meaningful (in the wide sense). 


7. Comparing the consumer price index for two different cities, say 
New York and Los Angeles, one would use different base prices and 
different weighting factors, but the same base year. Consider the meaning- 
fulness of the following statements (using the index K). 

(a) In a given year, the consumer price index in New York was 
greater than the consumer price index in Los Angeles. 

(b) In a given year, the consumer price index in New York rose by a 
higher percentage over the previous year than did the consumer price 
index in Los Angeles. 


8. In a detailed study of consumer confidence, Pickering et al. [1973] 
identified twenty-three variables related to consumer confidence. These 
included financial position compared with the previous year, personal 
expectations for economic development for the next three years, whether it 
is viewed as a good time to buy consumer durables, whether one has a 
desire to buy durables, and what are the employment expectations for next 


*We return to these indices in Exer. 19, Sec. 4.2. 
†Boot and Cox [1970].
Table 2.6. Typical Semantic Differential*


                        Agree              Agree      Agree with   Agree               Agree
                        Strongly   Agree   Slightly   Neither      Slightly   Agree    Strongly
Financially, we,                                                                                   Financially, we, as
as a family, are                                                                                   a family, are less
better off than           [ ]       [ ]      [ ]        [ ]          [ ]       [ ]       [ ]       well off than we
we were a year                                                                                     were a year ago
ago


*From Pickering et al. [1973].


year. Questions about each variable were posed using the so-called seven- 
point semantic differential scale. A typical question looked like that shown 
in Table 2.6. Answers to these questions were scaled 1 to 7, where 7 was
strongly agreeing with the most optimistic answer, 4 was agreeing with 
neither, etc. Arithmetic means of scale values were calculated. The survey 
was repeated later. 

(a) The mean answer to the financial position question was 4.325 in 
the first survey (February 1971) and 4.722 in the second survey (May 
1971). Consider whether or not it is meaningful to assert that the general 
financial position of families relative to their year-ago position improved 
from February 1971 to May 1971. 

(b) The twenty-three variables identified do not contribute equally 
to consumer confidence, and many of these variables overlap. A statistical 
analysis called principal components analysis was performed to find relative
weights of importance for each variable. (The analysis depended on
the original data, not the arithmetic means.) If $q_i(t)$ is the relative weight of
variable $i$ at the $t$th survey, and $p_i(t)$ is the mean value of the answer to the
question about variable $i$ in the $t$th survey, Pickering et al. point out that
one can use the standard Laspeyres index (Exer. 5) to calculate an index of
consumer confidence:


$$I(t) = \frac{\sum_i q_i(0)\, p_i(t)}{\sum_i q_i(0)\, p_i(0)}.$$


Consider the meaningfulness of the statement $I(t) > I(0)$.

(c) Pickering et al. suggest using a cross between a Laspeyres and a
Paasche index as a measure of consumer confidence. The measure proposed
is sometimes called an ideal Fisher index, and is calculated as


$$J(t) = \sqrt{ \frac{\sum_i q_i(0)\, p_i(t)}{\sum_i q_i(0)\, p_i(0)} \times \frac{\sum_i q_i(t)\, p_i(t)}{\sum_i q_i(t)\, p_i(0)} }\,.$$


This index, the authors claim, has the advantage of combining both 
current and base weights, and so allows for change in relative importance
of different variables and their pattern of relation to purchasing behavior.
Assuming that weights q,(t) and q,(0) are fixed, consider the meaningful- 
ness of the following statements: 

(i) J(t) > J(0).

(ii) J(t) = 2.7 J(0).
(If the weights are not thought of as fixed, but are thought of as derived 
scales based on the raw data, the situation becomes more difficult to 
analyze.) 

9. If mass were just an interval scale and all e(i, t, k) used the same
unit of mass, would the statement


$$\sum_{t,k} e(i, t, k) > \sum_{t,k} \sum_{j \neq i} e(j, t, k)$$

be meaningful?

10. The severity tonnage of a given pollutant i due to a given source is
the actual tonnage times the severity factor. Table 2.7 shows various 


Table 2.7. Annual Severity Tonnage due to Emissions by Various Pollutants from Various
Sources*


Pollutant          Source                        Annual Quantity   Severity   Severity
                                                 (10^6 tons)       Factor     Tonnage

Hydrocarbons       Transportation                     19.8          125        2480.0
                   Miscellaneous                       9.2          125        1150.0
                   Industry                            5.5          125         688.0
                   Solid waste disposal                2.0          125         250.0
                   Stationary fuel combustion          0.9          125         112.5
                   TOTAL                              37.4                     4680.5

Nitrogen oxides    Transportation                     11.2          22.4        251.0
                   Stationary fuel combustion         10.0          22.4        224.0
                   Miscellaneous                       2.0          22.4         44.8
                   Solid waste disposal                0.4          22.4          9.0
                   Industry                            0.2          22.4          4.5
                   TOTAL                              23.8                      533.3

Sulfur oxides      Stationary fuel combustion         24.4          15.3        373.2
                   Industry                            7.5          15.3        114.5
                   Transportation                      1.1          15.3         16.8
                   Miscellaneous                       0.2          15.3          3.1
                   Solid waste disposal                0.2          15.3          3.1
                   TOTAL                              33.4                      510.7

Carbon monoxide    Transportation                    111.5           1          111.5
                   Miscellaneous                      18.2           1           18.2
                   Industry                           12.0           1           12.0
                   Solid waste disposal                7.9           1            7.9
                   Stationary fuel combustion          1.8           1            1.8
                   TOTAL                             151.4                      151.4


*Data from Walther [1972].
pollutants, sources, and severity tonnages. From the table we can make the
following statements. Consider which of them is meaningful. 

(a) Hydrocarbon emissions are more severe (have greater severity 
tonnage) than nitrogen oxide emissions. 

(b) The effects of hydrocarbon and nitrogen oxide emissions from 
transportation are more severe than those of hydrocarbon and nitrogen 
oxide emissions from industry. 

(c) The effects of hydrocarbon and nitrogen oxide emissions from 
transportation are more severe than those of carbon monoxide emissions 
from industry. 

(d) The effects of hydrocarbon emissions from transportation are 
more than twenty times as severe as the effects of carbon monoxide 
emissions from transportation. 

(e) The total effect of hydrocarbon emissions due to all sources is 
more than eight times as severe as the total effect of nitrogen oxide 
emissions due to all sources. 


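11. (Pfanzagl [1968, p. 47]) Show that if $f$ is an interval scale, then the
following comparison is meaningful:


$$\frac{1}{n-1} \sum_{i=1}^{n} \left[ f(a_i) - \bar{f}_a \right]^2 > \frac{1}{n-1} \sum_{i=1}^{n} \left[ f(b_i) - \bar{f}_b \right]^2,$$

where

$$\bar{f}_a = \frac{1}{n} \sum_{i=1}^{n} f(a_i), \qquad \bar{f}_b = \frac{1}{n} \sum_{i=1}^{n} f(b_i).$$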


12. Table 2.8a shows the results of an experiment conducted in England 
to compare the performance of different stereo speakers. Relevant listening 
parameters were identified. Each expert rated each speaker on each param- 
eter (rating up to 10), and the sum of the experts’ scores multiplied by a 
weighting factor is shown in each column under the brand of speaker. The 
total subjective score of a particular speaker is obtained by adding the
numbers in its column. The subjective scores are used to rank-order the
speakers in Table 2.8b, first column. That is, suppose $w_i$ is the weighting
factor of the $i$th parameter, and $f_{ij}(a)$ is the rating of speaker $a$ by the $j$th
expert on the $i$th parameter. Speaker $a$ is ranked over speaker $b$ if and only
if


$$\sum_{i} \Big[ w_i \sum_{j} f_{ij}(a) \Big] > \sum_{i} \Big[ w_i \sum_{j} f_{ij}(b) \Big]. \tag{2.40}$$


(a) Suppose $f_{ij}$ defines an absolute scale and $w = w(i) = w_i$ defines
a ratio scale.
(i) Show that the comparison (2.40) is meaningful.
(ii) Consider what happens if $w_i$ defines an interval scale.
(b) Suppose $w$ is an absolute scale. Show that the situation is similar
to the importance ratings comparisons: if for all $i, j$, $f_{ij}$ defines the same


Table 2.8. Comparisons of Stereo Speakers* 


(a) 


Subjective Scores 


(Five assessors, each awarding up to full mark of ten for each parameter [5 x 10 = 50]. Wide range of music,
known voice, white and pink noise)


                        Maximum
Listening    Weighting  Possible   Marsden  Omal  SMC  Goodmans  Quasar
Parameters   Factor     Score      Hall
             (W)        (W x 50)
Smoothness 1.0 50 37 36; 283 33 334 
Mid-frequency 1.0 50 34 343 344 353 
coloration 
Overall tonal 0.95 47} 35 32 333 37 373 
balance 
Transients 0.87 433 32 31 31 323 32 
High-frequency 0.71 35} 26 25 223 25 243 
performance 
Low-frequency 0,66 33 24 26; 22} 24 234 
performance 
TOTAL SCORE _ 259} 188 185 169 186 1865 
Rank Order (on above scores) 1 4 7 3 2 
(b)


Order of Preference            Order on Basis of Performance for the Money
(from Subjective Score)        (Subjective Score Divided by Price)

Marsden-Hall                   SMC
Quasar                         Sansui
Goodmans                       Quasar
Omal                           Marsden-Hall
Sansui                         Goodmans
Dahlquist                      Omal
SMC                            Dahlquist


*Data from HiFi News & Record Review, 1975 page i107. The author thanks Issie
Rabinovitch for showing him this data.
ratio scale, then the comparison (2.40) is meaningful. (It is not clear that
this is a reasonable assumption for this example; what instructions were
given the experts is a crucial missing piece of information.)

(c) If the weighting factor $w$ defines a ratio scale and the $f_{ij}$ all
define a common ratio scale, is the comparison (2.40) still meaningful?

(d) Table 2.8b rank-orders the speakers on the basis of subjective 
score per dollar of price, a measure of “performance for the money.” 
Consider whether this ranking is meaningful. We return to this example, 
and present some critical remarks, in the exercises of Sections 5.4 and 6.2. 


13. Table 2.9 presents the results of a survey of seven experts by the 
President’s Council on Physical Fitness and Sports. The entry in column j,
row i is the sum of the ratings by the experts of the value of exercise j 
under criterion i. (Ratings by each expert were based on a scale of 0 to 3.) 
Consider what conclusions attainable from this table are meaningful. 

14. Apply the theory developed in this section to decide when it is 
meaningful to say that 


(a) one school district’s average reading score or average intelligence 
test score is higher than another’s; 


Table 2.9. Ratings of Different Forms of Exercise* 


s 
: i 
5 2 ry 
58 2 | : 
oo 2 > $ gs 3 
oo 2 & wo OS & # < oT) 2 = w 
Peer S23 2 2 5 2 FE 
3 3 3 g s 4 S & 3 3 6 & 2 
8S a4 4 = 4 8 &@ € S&S BOB A 
PHYSICAL FITNESS 
Cardiorespiratory 21 #19 #21 #18 «619 «19061906 (61606Cd16 10s 2B SS 
endurance (stamina) 
Muscular endurance 20 #18 #20 417 #18 «19 «17:0 «18 ~«6©1606«13:°:«614@0—6 6868 US 
Muscular strength 7 16 #14 #+15 #15 #15 15 15 14 16 Tl 9 7 5 
Flexibility 9 9 15 13 16 14 #%13 14 #14 «19 7 8 9 #7 
Balance 17. 18 «#12 20 #17) 16 16 21 16 15 8 8 7 6 
GENERAL WELL-BEING 
Weight control 21 20 #15 17 #19 17 19 15 16 12 136 7 «5 
Muscle definition 4°15 14 #14 «V1 12) 13) «14 ~«1306«18:h6UWt 6 65S 
Digestion 3.9 $12 #13 «IW 13° 12) «10 9 AR OM dl) op Be 9 
Sleep 16 #15 16 15 12 35 12 12 «D1 12 14 6 7 6 
TOTAL 148 142 140 140 140 139 134 134 128 126 102 66 64 51 


*Data from Medical Times, May 1976; data obtained from a survey of seven experts by the President’s 
Council on Physical Fitness and Sports. 

†Ratings for golf are based on the fact that many people ride a golf cart. Physical-fitness values improve
if one walks in golf. 
(b) one student’s grade point average is higher than a second
student’s; 

(c) one President’s average popularity rating over his term was 
higher than another President’s. 
CHAPTER 3


Three Representation Problems: 
Ordinal, Extensive, and Difference 
Measurement 


3.1 Ordinal Measurement 
3.1.1 Representation Theorem in the Finite Case 


In this chapter, we study the representation problem of fundamental 
measurement. We study two representations that arose in our discussion of 
temperature, preference, mass, and the like. These are the representations 
$(A, R) \to (Re, >)$ and $(A, R, \circ) \to (Re, >, +)$, where $R$ is a binary relation
on $A$, and $\circ$ is an operation on $A$. We present axioms on $(A, R)$ and
$(A, R, \circ)$ necessary and sufficient for the existence of the desired homomorphisms,
and we present a uniqueness theorem for each of the representations.
We then study a third representation, which also arises in the
measurement of temperature and of preference. 

In this section we illustrate the simplest case of fundamental measurement,
that dealing with the relational system (A, R). We seek a real-valued
function f on A such that for all a, b ∈ A,


$$aRb \iff f(a) > f(b). \tag{3.1}$$

This representation arose in our discussion of temperature and our discussion
of preference, and it arises in many measurement situations. We begin
with a representation theorem for the case where A is finite.


THEOREM 3.1. Suppose A is a finite set and R is a binary relation on A.
Then there is a real-valued function f on A satisfying


$$aRb \iff f(a) > f(b) \tag{3.1}$$

if and only if (A, R) is a strict weak order.


Before beginning the proof, let us recall that (A, R) is strict weak if and
only if it is

(i) asymmetric: $aRb \Rightarrow {\sim}bRa$
and

(ii) negatively transitive: ${\sim}aRb \ \&\ {\sim}bRc \Rightarrow {\sim}aRc$.

If R is (strict) preference, a function f satisfying Eq. (3.1) is called an 
(ordinal) utility function. Thus, to see if we can measure a person’s 
preferences to the extent of producing an ordinal utility function, we 
simply check whether or not these preferences satisfy the conditions of 
asymmetry and negative transitivity. In general, we could do this by doing
a pair comparison experiment. For every pair of alternatives a and b in A,
we present a and b and ask the individual to tell us which, if any, he 
prefers. We present these pairs in a random order, and use his judgments 
to define preference. Then we check if asymmetry and negative transitivity 
are satisfied. (The results might be presented in a table like Table 3.1, 
which shows an individual’s preferences among several composers.) 

As we remarked in the Introduction, there are two interpretations for 
axioms such as asymmetry and negative transitivity. One is that these are 
testable conditions which describe what a person’s preferences must be for 
measurement to take place. We then take the descriptive approach, and 
simply ask whether or not a person’s preferences satisfy these conditions. 
Alternatively, we could use these axioms to define rationality. We could 
say that an individual who violates these axioms is acting irrationally. 
Indeed, many would say that an individual presented with a violation of, 
say, negative transitivity would say: “Oh, I’ve made a mistake.” This 
approach is the prescriptive or normative approach, and it is usually the 
approach taken in economic theory. The representation theorem is used to 
define the class of individuals to whom the theory applies, the so-called 
rational individuals.* It is this second approach or interpretation that we 
apply to this representation theorem if it is used to study measurement of 
temperature. If R is the relation “warmer than,” we think of the conditions 
of asymmetry and negative transitivity as conditions of rationality, which 
must be satisfied before measurement can take place. 

Of course, in the case of warmer than, we can also think of these 
conditions as testable, and we would be surprised if an individual violated 
them, or at least we might be tempted to think of violations more as 
experimental errors than “real” violations. In the case of preference, the 
situation is different. If we subject the asymmetry and negative transitivity 
conditions to an experimental test, with R taken to be strict preference, 
then we often find they are violated. For example, an individual may think 


*See footnote on page 4 for a distinction between prescriptive and normative, between the 
idealized “superrational being” and the “normally intelligent” individual. This is a distinction 
made in Keeney and Raiffa [1976]. 


Table 3.1. Preferences among Composers 
(Item i is preferred to item j iff the i, j entry is 1.) 

          Mozart    Haydn     Brahms    Beethoven Wagner    Bach      Mahler    Strauss   Row Sum 

Mozart     0         1         1         0         0         0         1         1         4 
Haydn      0         0         1         0         0         0         1         0         2 
Brahms     0         0         0         0         0         0         0         0         0 
Beethoven  1         1         1         0         1         0         1         1         6 
Wagner     0         1         1         0         0         0         1         1         4 
Bach       1         1         1         0         1         0         1         1         6 
Mahler     0         0         1         0         0         0         0         0         1 
Strauss    0         1         1         0         0         0         1         0         3 


price is more important than quality, but choose on the basis of quality if 
prices are close. Thus, if a and b are close in price and b and c are close in 
price, he may prefer a to b because a is of higher quality than b, and he 
may prefer b to c because b is of higher quality than c. But he may prefer c 
to a because c is sufficiently lower in price than a to make a difference. 
Then transitivity of preference is violated. Moreover, so is negative 
transitivity, for he does not prefer c to b and he does not prefer b to a, but he 
prefers c to a.† If one of the axioms such as negative transitivity is violated, 
then the representation (3.1) cannot be achieved. 

If the violation of a measurement axiom is systematic—that is, if it has 
some sort of pattern—then often a different measurement representation 
must be sought. In the case of preference where some form of transitivity is 
violated, we shall describe in detail one such alternative representation 
(using the notion of semiorder) in Section 6.1, and we shall mention 
another (the additive difference model) in Section 5.5. 

Alternatively, seeing a systematic violation of a measurement axiom, one 
can find some “statistical” pattern to it. Indeed, as Falmagne [1976a] 
argues, statistical regularities are the only regularities one is likely to find, 
at least in the behavioral sciences. Hence, a statistical or random analogue 
of the deterministic fundamental measurement theories must be developed. 
Without this, Falmagne argues, the measurement theory models we de- 
scribe will not pass many experimental tests. Falmagne [1976a, 1978, 1979] 
and Falmagne, Iverson, and Marcovici [1978] begin to make progress on 
such random analogues of deterministic measurement theories. A related 
approach based on the notion of probabilistic consistency is described in 
Sec. 6.2 below. 


†This argument is due to Krantz et al. [1971, p. 17]. We shall present other arguments 
against these simple axioms for preference later in this volume. 

Sometimes, the violation of a measurement axiom could be due to a 
matter of minor errors or “noise.” In that case, a theory of error or noise is 
called for. See Adams [1965] or Krantz et al. [to appear] for a discussion of 
error theories and Adams and Carlstrom [1979] for a new approach to 
error. In a situation of error or noise, some statistical discussion of 
goodness of fit of a measurement representation is called for. The literature 
of measurement theory has not been very helpful on the development of 
such statistical tests. For some recent foundational work on this subject, 
see Falmagne [1976b]. In general, much foundational work still needs to be 
done along these lines. Much practical work also needs to be done, to 
subject the measurement axioms we shall present in different places in this 
volume to a systematic experimental test in a variety of contexts. 

To prove Theorem 3.1, let us suppose first that there is a homomorphism 
f satisfying Eq. (3.1). We show that (A, R) is strict weak. First, (A, R) is 
asymmetric. For if aRb, then f(a) > f(b), whence not f(b) > f(a), and 
~bRa. Second, (A, R) is negatively transitive. For if ~aRb and ~bRc, 
then ~[f(a) > f(b)] and ~[f(b) > f(c)], so f(a) ≤ f(b) and f(b) ≤ f(c). 
It follows that f(a) ≤ f(c), so ~[f(a) > f(c)], so ~aRc. 

Conversely, suppose (A, R) is a strict weak order. The proof that a 
homomorphism f exists is constructive. We define f(x) as follows: 


f(x) = the number of y in A such that xRy. (3.2) 


Let us begin by illustrating this construction. Suppose A = {a, b, c, d} and 

R = {(a, c), (a, d), (b, c), (b, d), (c, d)}. 


Then it is easy to show that (A, R) is a strict weak order. The function f 
defined by (3.2) is given by 


f(a) = 2 (aRc, aRd), 
f(b) = 2 (bRc, bRd), 
f(c) = 1 (cRd), 
f(d) = 0. 


It is easy to check that f is a homomorphism. 
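
The construction of Eq. (3.2) and the check of Eq. (3.1) can be sketched in a few lines of Python for this example. The names count_utility and satisfies_3_1 are illustrative, and storing R as a set of pairs is an assumption of the sketch.

```python
def count_utility(A, R):
    """f(x) = number of y in A with xRy, as in Eq. (3.2)."""
    return {x: sum((x, y) in R for y in A) for x in A}

def satisfies_3_1(A, R, f):
    """Check aRb <=> f(a) > f(b) for every ordered pair (a, b)."""
    return all(((a, b) in R) == (f[a] > f[b]) for a in A for b in A)

A = ["a", "b", "c", "d"]
R = {("a", "c"), ("a", "d"), ("b", "c"), ("b", "d"), ("c", "d")}

f = count_utility(A, R)
print(f)                       # {'a': 2, 'b': 2, 'c': 1, 'd': 0}
print(satisfies_3_1(A, R, f))  # True
```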

To prove formally that a function f defined by Eq. (3.2) satisfies Eq. 
(3.1), we recall that by Theorem 1.3, every strict weak order is transitive. If 
aRb, then by transitivity of R, bRy implies aRy for every y. Thus, the 
number of y such that aRy is at least as big as the number of y such that 
bRy. It follows that f(a) ≥ f(b). Moreover, aRb but not bRb, since a strict 
weak order is irreflexive. Thus, f(a) > f(b). Conversely, if ~aRb, then 
~bRy implies ~aRy, by negative transitivity. Hence, aRy implies bRy, so 
f(b) ≥ f(a), whence ~[f(a) > f(b)]. This proves (3.1) and completes the 
proof of Theorem 3.1. 

It should be remarked that if A is finite and f defined by (3.2) does not 
give a homomorphism, then (A, R) is not strict weak, and so there is no 
homomorphism. Thus, this gives us another test for existence of a 
homomorphism: simply try to build one by a fixed procedure and verify 
whether or not you have succeeded. We state the result as a corollary. 


COROLLARY 1. Suppose A is a finite set and R is a binary relation on A. 
Then there is a real-valued function f on A satisfying Eq. (3.1) if and only if 
the following function f satisfies Eq. (3.1): 


f(x) = the number of y such that xRy. (3.2) 


Let us apply these ideas to measurement of preference. Suppose an 
individual is asked his preferences among composers in a pair comparison 
format, and gives the data shown in Table 3.1. Then f(x) as defined by Eq. 
(3.2) is given by the row sum of row x. Specifically, we obtain 


f(Beethoven) = f(Bach) = 6, 
f(Wagner) = f(Mozart) = 4, 
f(Strauss) = 3, 
f(Haydn) = 2, 
f(Mahler) = 1, 
f(Brahms) = 0. 

If the matrix is rearranged so that alternatives are listed in descending 
order of row sums (with arbitrary ordering in case of ties), then we can 
easily test whether f is a homomorphism by checking that there are 1’s in 
row x for all those y with f(y) < f(x). This should give a block of 1’s in 
row x from some point to the end. In our example, f is a homomorphism. 
(The rearranged matrix is shown in Table 3.2.) Thus, f is an ordinal utility 
function for this individual. 
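
For data given as a 0/1 matrix, the row-sum test can be sketched in Python as follows. The matrix is that of Table 3.1; the function names are illustrative, and tied rows are left in the order produced by Python's stable sort, which is one of the arbitrary orders the text allows.

```python
names = ["Mozart", "Haydn", "Brahms", "Beethoven", "Wagner", "Bach", "Mahler", "Strauss"]
M = [  # Table 3.1: M[i][j] == 1 iff item i is preferred to item j
    [0, 1, 1, 0, 0, 0, 1, 1],
    [0, 0, 1, 0, 0, 0, 1, 0],
    [0, 0, 0, 0, 0, 0, 0, 0],
    [1, 1, 1, 0, 1, 0, 1, 1],
    [0, 1, 1, 0, 0, 0, 1, 1],
    [1, 1, 1, 0, 1, 0, 1, 1],
    [0, 0, 1, 0, 0, 0, 0, 0],
    [0, 1, 1, 0, 0, 0, 1, 0],
]

def row_sums(M):
    """f(x) of Eq. (3.2), one value per row."""
    return [sum(row) for row in M]

def is_homomorphism(M):
    """Check that entry (i, j) is 1 exactly when f(i) > f(j)."""
    f = row_sums(M)
    n = len(M)
    return all((M[i][j] == 1) == (f[i] > f[j]) for i in range(n) for j in range(n))

def rearranged(names, M):
    """List the alternatives in descending order of row sum."""
    f = row_sums(M)
    order = sorted(range(len(M)), key=lambda i: -f[i])
    return [names[i] for i in order], [[M[i][j] for j in order] for i in order]

new_names, new_M = rearranged(names, M)
print(is_homomorphism(M))  # True
print(new_names)
# Every rearranged row is a run of 0's followed by a run of 1's.
print(all(row == sorted(row) for row in new_M))  # True
```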

On the other hand, if preferences are expressed as in Table 3.3, then f(x) 
as defined by Eq. (3.2) is 

f(Mozart) = f(Haydn) = 6, 
f(Beethoven) = 5, 
f(Brahms) = 4, 
f(Wagner) = 3, 
f(Mahler) = 2, 
f(Bach) = 1, 
f(Strauss) = 0. 


Table 3.2. Table 3.1 Rearranged, with the Blocks of 1’s Shown 
(Each block of 1’s is enclosed in square brackets.) 

          Beethoven Bach      Wagner    Mozart    Strauss   Haydn     Mahler    Brahms    Row Sum 

Beethoven  0         0        [1         1         1         1         1         1]        6 
Bach       0         0        [1         1         1         1         1         1]        6 
Wagner     0         0         0         0        [1         1         1         1]        4 
Mozart     0         0         0         0        [1         1         1         1]        4 
Strauss    0         0         0         0         0        [1         1         1]        3 
Haydn      0         0         0         0         0         0        [1         1]        2 
Mahler     0         0         0         0         0         0         0        [1]        1 
Brahms     0         0         0         0         0         0         0         0         0 


Table 3.3. Preferences among Composers 
(Item i is preferred to item j iff the i, j entry is 1.) 

          Mozart    Haydn     Brahms    Beethoven Wagner    Bach      Mahler    Strauss   Row Sum 

Mozart     0         1         1         0         1         1         1         1         6 
Haydn      0         0         1         1         1         1         1         1         6 
Brahms     0         0         0         0         1         1         1         1         4 
Beethoven  1         0         0         0         1         1         1         1         5 
Wagner     0         0         0         0         0         1         1         1         3 
Bach       0         0         0         0         0         0         0         1         1 
Mahler     0         0         0         0         0         1         0         1         2 
Strauss    0         0         0         0         0         0         0         0         0 


Rearranging the matrix, we see in Table 3.4 that the row corresponding to 
Mozart has its block of 1’s broken by a 0 in the Mozart, Beethoven entry. 
We discover that Beethoven is preferred to Mozart, even though 
f(Beethoven) < f(Mozart). Thus, f does not define a homomorphism and 
so, by Corollary 1 to Theorem 3.1, there can be none. There is no ordinal 
utility function. This implies that one of the strict weak order axioms will 
have to be broken, and it is not hard to discover that negative transitivity is 
violated: Beethoven is preferred to Mozart, but Haydn is not preferred to 
Mozart, and Beethoven is not preferred to Haydn. 


Table 3.4. Table 3.3 Rearranged 
(The 0 that breaks the block of 1’s in the Mozart row is shown in parentheses.) 

          Mozart    Haydn     Beethoven Brahms    Wagner    Mahler    Bach      Strauss   Row Sum 

Mozart     0         1        (0)        1         1         1         1         1         6 
Haydn      0         0         1         1         1         1         1         1         6 
Beethoven  1         0         0         0         1         1         1         1         5 
Brahms     0         0         0         0         1         1         1         1         4 
Wagner     0         0         0         0         0         1         1         1         3 
Mahler     0         0         0         0         0         0         1         1         2 
Bach       0         0         0         0         0         0         0         1         1 
Strauss    0         0         0         0         0         0         0         0         0 
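
The failure for the data of Table 3.3 can also be located mechanically. The sketch below, in Python, lists the pairs on which the row-sum function disagrees with the data and the triples that violate negative transitivity. The matrix is Table 3.3 in its original column order; the function names are illustrative.

```python
names = ["Mozart", "Haydn", "Brahms", "Beethoven", "Wagner", "Bach", "Mahler", "Strauss"]
T = [  # Table 3.3: T[i][j] == 1 iff item i is preferred to item j
    [0, 1, 1, 0, 1, 1, 1, 1],
    [0, 0, 1, 1, 1, 1, 1, 1],
    [0, 0, 0, 0, 1, 1, 1, 1],
    [1, 0, 0, 0, 1, 1, 1, 1],
    [0, 0, 0, 0, 0, 1, 1, 1],
    [0, 0, 0, 0, 0, 0, 0, 1],
    [0, 0, 0, 0, 0, 1, 0, 1],
    [0, 0, 0, 0, 0, 0, 0, 0],
]

def homomorphism_failures(names, M):
    """Pairs (x, y) where 'x preferred to y' disagrees with f(x) > f(y)."""
    f = [sum(row) for row in M]
    n = len(M)
    return [(names[i], names[j]) for i in range(n) for j in range(n)
            if (M[i][j] == 1) != (f[i] > f[j])]

def negative_transitivity_violations(names, M):
    """Triples (a, b, c) with not aRb, not bRc, and yet aRc."""
    idx = range(len(M))
    return [(names[a], names[b], names[c]) for a in idx for b in idx for c in idx
            if not M[a][b] and not M[b][c] and M[a][c]]

failures = homomorphism_failures(names, T)
violations = negative_transitivity_violations(names, T)
print(("Beethoven", "Mozart") in failures)             # True
print(("Beethoven", "Haydn", "Mozart") in violations)  # True
```

The triple (Beethoven, Haydn, Mozart) is the violation described in the text; the list may contain others as well.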


It can be a rather time-consuming process to perform a pair comparison 
experiment and make an individual compare every pair of elements. The 
set of pairs can be very large. In Chapter 7 we describe an alternative 
practical procedure for calculating an ordinal utility function, if one exists, 
which avoids this difficulty. The procedure is based on the expected utility 
hypothesis. An extensive discussion of practical techniques for calculating 
utility functions which are quite different from the techniques we have 
described can be found in Keeney and Raiffa [1976]. Their idea is to assess 
the general shape of an individual’s utility function by judging his attitude 
toward risk. Then several data points are obtained and a curve of the 
determined shape is fitted. 

Before leaving Theorem 3.1, we state an additional corollary. 


COROLLARY 2. Suppose A is a finite set and R is a binary relation on A. 
Then there is a real-valued function f on A satisfying 


aRb ⇔ f(a) ≥ f(b) (3.3) 
if and only if (A, R) is a weak order. 


Proof. If (A, R) is a weak order, define S on A by 

aSb ⇔ ~bRa. 


Then (A, S) is a strict weak order. (Proof is left to the reader.) If f on A 
satisfies Eq. (3.1) with S in place of R, then it satisfies Eq. (3.3) for (A, R). 
Indeed, aRb ⇔ ~bSa ⇔ ~[f(b) > f(a)] ⇔ f(a) ≥ f(b). 
The proof that if f satisfies Eq. (3.3) then (A, R) is weak is left to the 
reader. ∎ 
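
The passage from a weak order R to the strict relation S can be tried out on a small example. The following Python sketch uses a hypothetical four-element weak order built from made-up "levels"; the helper name strict_part is illustrative.

```python
def strict_part(A, R):
    """S with aSb <=> not bRa."""
    return {(a, b) for a in A for b in A if (b, a) not in R}

# Hypothetical weak order: aRb iff a's level is at least b's level.
A = ["p", "q", "r", "s"]
level = {"p": 0, "q": 1, "r": 1, "s": 2}
R = {(a, b) for a in A for b in A if level[a] >= level[b]}

S = strict_part(A, R)
# Utility for S by the counting rule of Eq. (3.2).
f = {x: sum((x, y) in S for y in A) for x in A}

print(f)  # {'p': 0, 'q': 1, 'r': 1, 's': 3}
# f satisfies Eq. (3.3) for the original weak order R.
print(all(((a, b) in R) == (f[a] >= f[b]) for a in A for b in A))  # True
```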
3.1.2 The Uniqueness Theorem 


Turning next to the uniqueness problem, we state a uniqueness theorem 
for the representation (3.1). 


THEOREM 3.2. Suppose A is a finite set, R is a binary relation on A, and f 
is a real-valued function on A satisfying 


aRb ⇔ f(a) > f(b). (3.1) 


Then 𝔄 = (A, R) → 𝔅 = (Re, >) is a regular representation and (𝔄, 𝔅, f) 
is an ordinal scale. 


Proof. We first show that every function f satisfying Eq. (3.1) is a regular 
scale, so we are dealing with a regular representation. Suppose g is another 
function satisfying Eq. (3.1). Then f(a) = f(b) implies ~ aRb and ~ bRa, 
which implies g(a) = g(b). Regularity of f follows by Theorem 2.1. 

Next we show that (𝔄, 𝔅, f) is an ordinal scale. If φ: f(A) → Re is any 
monotone increasing function, then φ ∘ f satisfies (3.1) whenever f does. 
For 


(φ ∘ f)(a) > (φ ∘ f)(b) ⇔ f(a) > f(b) ⇔ aRb. 


Conversely, suppose f satisfies Eq. (3.1) and we are given a function 
φ: f(A) → Re such that φ ∘ f also satisfies (3.1). We shall show that φ is 
monotone increasing. Suppose α and β are in f(A), with α = f(a) and 
β = f(b). Then 


α > β ⇔ aRb ⇔ (φ ∘ f)(a) > (φ ∘ f)(b) ⇔ φ(α) > φ(β). 


We conclude that the class of admissible transformations is the class of 
monotone increasing functions, and so (𝔄, 𝔅, f) is an ordinal scale. ∎ 
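
A small numerical check of the uniqueness statement can be sketched in Python. The utility values and the two transformations below are hypothetical choices: exp is monotone increasing on f(A), while v ↦ (v − 1)² is not.

```python
import math

def represents(R, g):
    """Check xRy <=> g(x) > g(y) for all x, y in the domain of g."""
    return all(((x, y) in R) == (g[x] > g[y]) for x in g for y in g)

f = {"a": 2, "b": 2, "c": 1, "d": 0}
R = {(x, y) for x in f for y in f if f[x] > f[y]}  # the relation f represents

g = {x: math.exp(v) for x, v in f.items()}   # monotone increasing: admissible
h = {x: (v - 1) ** 2 for x, v in f.items()}  # not monotone on f(A): not admissible

print(represents(R, f), represents(R, g), represents(R, h))  # True True False
```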


Let us apply the uniqueness theorem (Theorem 3.2) to the case of 
temperature. The representation (3.1) arises in the measurement of temper- 
ature if R is interpreted as “warmer than.” Applying Theorem 3.2, we 
conclude that temperature is an ordinal scale, whereas in Section 2.3 we 
suggested that temperature is an interval scale. Is there something wrong 
with our whole formalism? The answer is that we have not made use of all 
the properties that a scale of temperature preserves. We can obtain 
temperature as an interval scale if we observe that it is possible to make 
judgments of comparative temperature difference. To make this precise in 
the theory, one introduces a quaternary relation D on a set A of objects 
whose temperatures are being compared. The relation D(a, b, s, t), or 
abDst, is interpreted to mean that the difference between the temperature 
of a and the temperature of b is judged to be more than the difference 
between the temperature of s and the temperature of t. One seeks a 
real-valued function f on A such that for all a, b, s, t ∈ A, 


abDst ⇔ f(a) − f(b) > f(s) − f(t). (3.4) 


Under some reasonable assumptions, this numerical assignment is regular 
and unique up to a positive linear transformation and hence defines an 
interval scale. We return to the representation (3.4) in Section 3.3. 
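
The contrast between representations (3.1) and (3.4) can be seen on a toy example. In the Python sketch below, the three "temperatures" are hypothetical values; a positive linear transformation leaves the difference relation D unchanged, while squaring (monotone on these nonnegative values, so harmless for (3.1)) changes it.

```python
from itertools import product

def difference_relation(f):
    """D = set of (a, b, s, t) with f(a) - f(b) > f(s) - f(t), as in Eq. (3.4)."""
    return {q for q in product(f, repeat=4)
            if f[q[0]] - f[q[1]] > f[q[2]] - f[q[3]]}

f = {"x": 0, "y": 2, "z": 3}  # hypothetical temperature values
D = difference_relation(f)

linear = {k: 2 * v + 5 for k, v in f.items()}  # positive linear transformation
square = {k: v ** 2 for k, v in f.items()}     # monotone on these values, not linear

print(difference_relation(linear) == D)  # True
print(difference_relation(square) == D)  # False
```

Here (y, x, z, y) belongs to D, since 2 − 0 > 3 − 2, but after squaring the comparison becomes 4 − 0 > 9 − 4, which fails.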

It should be emphasized here how the desired properties of a scale can 
influence our choice of a representation. We were able to measure temper- 
ature in the sense of Eq. (3.1), but obtained a scale that did not have strong 
enough properties. Thus, we were led to seek a more stringent representa- 
tion, but also one that can be based on sensible empirical relations. 


3.1.3 The Countable Case 


Most practical applications of measurement or scaling deal with the case 
of a finite set of objects A. However, if measurement representations are 
used to put measurement on a firm theoretical foundation, it is important 
to consider theoretically infinite populations. Thus, we shall frequently ask 
whether a result has an analogue for infinite sets of objects. Theorems 3.1 
and 3.2 actually hold in the more general case where A is countable, that is, 
may be put in one-to-one correspondence with a set of positive integers.* 
The representation part of this result is due to Cantor [1895]. We proceed 
to prove it. 


THEOREM 3.3 (Cantor). Suppose A is a countable set and R is a binary 
relation on A. Then there is a real-valued function f on A satisfying 


aRb ⇔ f(a) > f(b) (3.1) 


if and only if (A, R) is a strict weak order. Moreover, if there is such an f, 
then 𝔄 = (A, R) → 𝔅 = (Re, >) is a regular representation and (𝔄, 𝔅, f) is 
an ordinal scale. 


Proof. Suppose (A, R) is strict weak. Since A is countable, we may list 
the elements of A as x_1, x_2, x_3, ... . Then, we define 


r_ij = 1 if x_i R x_j, 
       0 otherwise. 


*Our use of the word countable includes finite sets, and we use the word denumerable for 
sets which are in one-to-one correspondence with the whole set of positive integers. 

The definition of a function f that satisfies Eq. (3.1) is analogous to the 
explicit definition in Eq. (3.2). Namely, we take* 


f(x_i) = Σ_{j=1}^{∞} (1/2^j) r_ij. (3.5) 


The sum in (3.5) converges, since 


Σ_{j=1}^{∞} 1/2^j 


converges. Now, given a and b in A, we have a = x_i and b = x_k for some i 
and k. If aRb, then, as in the proof of Theorem 3.1, we observe that 
bRy ⇒ aRy and moreover that aRb but not bRb. Hence, by construction, 


f(x_i) ≥ f(x_k) + 1/2^k > f(x_k); 


that is, f(a) > f(b). Conversely, if ~aRb, then again as in the proof of 
Theorem 3.1, aRy ⇒ bRy, so f(b) ≥ f(a), and ~[f(a) > f(b)]. This 
proves (3.1). 

The converse is proved just as it was in Theorem 3.1 and the uniqueness 
is proved just as it was in Theorem 3.2. ∎ 
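
The weighted sum of Eq. (3.5) can be tried on a finite prefix of an enumeration. The Python sketch below uses exact fractions; the listed elements 1, ..., 20 and the "tier" relation are hypothetical, and a finite list only stands in for the countable set of the theorem.

```python
from fractions import Fraction

def dyadic_utility(xs, R):
    """f(x_i) = sum over j of 2^(-j) * r_ij, restricted to the listed x_1, ..., x_n."""
    return {
        x: sum((Fraction(1, 2 ** j) for j, y in enumerate(xs, start=1) if R(x, y)),
               Fraction(0))
        for x in xs
    }

xs = list(range(1, 21))           # hypothetical enumeration x_1, ..., x_20
R = lambda a, b: a // 2 > b // 2  # strict weak order: compare tiers a // 2

f = dyadic_utility(xs, R)
print(f[1], f[2], f[3])  # 0 1/2 1/2
print(all(R(a, b) == (f[a] > f[b]) for a in xs for b in xs))  # True
```

Unlike the counting rule of Eq. (3.2), the weights 1/2^j keep every value below 1 no matter how long the list grows, which is what lets the construction extend to a denumerable set.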


We next state several corollaries of Theorem 3.3. The first uses the 
notion of reduction defined in Section 1.5. 


COROLLARY 1. Suppose (A, R) is a strict weak order, (A*, R*) is its 
reduction, and A* is countable. Then there is a real-valued function f on A 
satisfying 


aRb ⇔ f(a) > f(b). (3.1) 


Proof. Recall that by Theorem 1.4, if (A, R) is a strict weak order, then 
(A*, R*) is a strict simple order and hence a strict weak order. By 
Theorem 3.3, there is a real-valued function F on A* satisfying Eq. (3.1) 
with R* in place of R. Then let f(a) = F(a*), where a* is the equivalence 
class in A* containing a. The function f satisfies (3.1) for R. ∎ 
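
The lifting step f(a) = F(a*) can be illustrated on a finite strict weak order. In the Python sketch below, the data and the helper names are hypothetical, and the classes of the reduction are computed directly from the relation (two elements are grouped when neither is related to the other).

```python
def indifference_classes(A, R):
    """Partition A into classes of elements a, b with neither aRb nor bRa."""
    classes = []
    for a in A:
        for cls in classes:
            rep = cls[0]
            if (a, rep) not in R and (rep, a) not in R:
                cls.append(a)
                break
        else:
            classes.append([a])
    return classes

def lifted_utility(A, R):
    """Build F on one representative per class, then set f(a) = F(a*)."""
    classes = indifference_classes(A, R)
    reps = [cls[0] for cls in classes]
    F = {r: sum((r, s) in R for s in reps) for r in reps}  # utility on the reduction
    return {a: F[cls[0]] for cls in classes for a in cls}

# Hypothetical strict weak order: a, b, c tied on top, then d, then e.
A = ["a", "b", "c", "d", "e"]
R = {(x, y) for x in "abc" for y in "de"} | {("d", "e")}

f = lifted_utility(A, R)
print(f)  # {'a': 2, 'b': 2, 'c': 2, 'd': 1, 'e': 0}
```

Grouping by a single representative is valid only because indifference is an equivalence relation for a strict weak order.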


COROLLARY 2. Suppose A is a countable set and R is a binary relation on 
A. Then there is a real-valued function f on A satisfying 


aRb ⇔ f(a) ≥ f(b) (3.3) 


*This idea is due to David Radford (personal communication). 

if and only if (A, R) is a weak order. Moreover, if there is such an f, then the 
representation 𝔄 = (A, R) → 𝔅 = (Re, ≥) is regular and (𝔄, 𝔅, f) is an 
ordinal scale. 


Proof. The proof is analogous to that of Corollary 2 of Theorem 3.1. 


3.1.4 The Birkhoff-Milgram Theorem 


It is easy to show that Theorem 3.3 is false without the assumption that 
A be countable. To give a counterexample, suppose we let A = Re × Re, 
and we define R on A by 


(a, b)R(s, t) ⇔ [a > s or (a = s & b > t)]. (3.6) 


This relation (A, R) is called the lexicographic ordering of the plane. The 
lexicographic ordering of the plane corresponds to the ordering of words in 
a dictionary. We order first by first letter, in case of the same first letter, by 
second letter, and so on. It is easy to see that (A, R) is a strict weak order, 
indeed that it is strict simple. We shall show that there is no real-valued 
function f on A satisfying (3.1). For suppose that such an f exists. Now we 
know that (a, 1)R(a, 0). Thus, by (3.1), f(a, 1) > f(a, 0). We know that 
between any two real numbers there is a rational. Thus, there is a rational 
number g(a) so that 


f(a, 1) > g(a) > f(a, 0). 


Now the function g is defined on the set Re and maps it into the set of 
rationals. Moreover, it maps Re into the rationals in a one-to-one fashion. 
For, if a ≠ b, then either a > b or b > a, say a > b. Then we have 


g(a) > f(a, 0) > f(b, 1) > g(b), 


from which we conclude that g(a) > g(b). It is well known that there can 
be no one-to-one mapping from the reals into the rationals. Thus, we have 
reached a contradiction. We conclude that the strict simple (weak) order 
(A, R) cannot be represented in the form (3.1). 
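
The relation of Eq. (3.6) is easy to write down in code, and its order properties can be spot-checked on a finite sample of points. The Python sketch below is only an illustration: the sample grid is hypothetical, and on any finite sample a representation does exist (by Theorem 3.1), so the impossibility argument applies only to the whole plane.

```python
from itertools import product

def lex(p, q):
    """(a, b) R (s, t)  <=>  a > s or (a = s and b > t), as in Eq. (3.6)."""
    (a, b), (s, t) = p, q
    return a > s or (a == s and b > t)

# Hypothetical finite sample of points in the plane.
pts = list(product([0.0, 0.5, 1.0], repeat=2))

asymmetric = all(not (lex(p, q) and lex(q, p)) for p in pts for q in pts)
complete = all(p == q or lex(p, q) or lex(q, p) for p in pts for q in pts)
transitive = all(lex(p, r) for p in pts for q in pts for r in pts
                 if lex(p, q) and lex(q, r))

print(asymmetric, complete, transitive)  # True True True
# The step used in the argument: if a > b, then (a, 0) R (b, 1).
print(lex((1.0, 0.0), (0.5, 1.0)))       # True
```

Python's built-in tuple comparison is also lexicographic, so lex(p, q) agrees with p > q on pairs of numbers.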

A theorem stating conditions on a binary relation (A, R) both necessary 
and sufficient for the existence of a representation satisfying (3.1) can be 
stated, even if A is uncountable. Any (A, R) that can be mapped homo- 
morphically into the real numbers must reflect the properties of the reals. 
In particular, the reals have a countable subset, the rationals, which is 
order-dense in the sense that whenever a > b for nonrationals a and b, 
there is a rational c such that a > c > b. In general, if (A, R) is a binary 
relation and B ⊆ A, let us say that B is order-dense in (A, R) if, whenever 
a and b are in A − B, aRb and ~bRa,* then there is a c in B such that 


*For asymmetric relations, we do not have to assume ~bRa, for this follows from aRb. 

aRcRb. The concept of order-denseness is different from the more 
well-known concept of denseness: B is said to be dense in (A, R) if whenever a 
and b are in A, aRb and ~ bRa, then there is c in B such that aRcRb. 
Every dense subset B of (A, R) is order-dense. But the converse is not true. 
Take A to be the integers and B to be the even integers. Then between any 
two odd integers (elements of A — B) there is an even integer, but it is not 
true that between every two integers there is an even integer. 
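
The difference between the two notions can be checked on a finite window of the integers. The Python sketch below follows the two definitions literally; the window from −10 to 10 and the function names are assumptions of the sketch.

```python
def is_order_dense(A, B, R):
    """For a, b in A - B with aRb and not bRa, some c in B has aRc and cRb."""
    outside = [x for x in A if x not in B]
    return all(
        any(R(a, c) and R(c, b) for c in B)
        for a in outside for b in outside
        if R(a, b) and not R(b, a)
    )

def is_dense(A, B, R):
    """For a, b in A with aRb and not bRa, some c in B has aRc and cRb."""
    return all(
        any(R(a, c) and R(c, b) for c in B)
        for a in A for b in A
        if R(a, b) and not R(b, a)
    )

A = list(range(-10, 11))          # a finite window of the integers
B = [x for x in A if x % 2 == 0]  # the even integers in the window
gt = lambda a, b: a > b

print(is_order_dense(A, B, gt))  # True
print(is_dense(A, B, gt))        # False
```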

Suppose we concentrate momentarily on strict simple orders. It turns out 
that what characterizes those strict simple orders representable in the form 
(3.1) is that they have a countable order-dense subset. (It is not necessarily 
true that they have a countable dense subset: the integers under the 
ordering > do not.) From the result about strict simple orders it will 
follow that a strict weak order is representable in the form (3.1) if and only 
if its reduction (A*, R*) has a countable order-dense subset. 


THEOREM 3.4 (Birkhoff-Milgram). Suppose (A, R) is a strict simple 
order. Then there is a real-valued function f on A satisfying 


aRb ⇔ f(a) > f(b) (3.1) 


if and only if (A, R) has a countable order-dense subset. Moreover, if there is 
such an f, then the representation 𝔄 = (A, R) → 𝔅 = (Re, >) is a regular 
representation and (𝔄, 𝔅, f) is an ordinal scale. 


The representation part of this theorem was apparently first proved in 
Milgram [1939]. It is proved in Birkhoff [1948], though his proof is 
incomplete. We shall present a proof at the end of this section.* 

We now state several corollaries of Theorem 3.4. 


COROLLARY 1 (Birkhoff-Milgram Theorem).† Suppose (A, R) is a binary 
relation. Then there is a real-valued function f on A satisfying 


aRb ⇔ f(a) > f(b) (3.1) 


if and only if (A, R) is a strict weak order and its reduction (A*, R*) has a 
countable order-dense subset. Moreover, if there is such an f, then the 
representation 𝔄 = (A, R) → 𝔅 = (Re, >) is regular and (𝔄, 𝔅, f) is an 
ordinal scale. 


*It should be remarked that, while the condition of having a countable order-dense subset 
seems reasonable, it is not empirically testable, for it would be impossible to explicitly verify 
this condition for real data. Any experiment would only give finite data. Thus, this axiom is 
one we either accept or reject as a reasonable idealization. 

†Both Theorem 3.4 and this Corollary are called the Birkhoff-Milgram Theorem. 

Proof. Sufficiency follows from Theorem 3.4, for by Theorem 1.4, 
(A*, R*) is strict simple, and if F gives a representation for (A*, R*), then 
f(a) = F(a*) gives a representation for (A, R). To prove the converse, one 
proceeds exactly as in the proof of Theorem 3.1 to show that (A, R) is 
strict weak. To prove that (A*, R*) has a countable order-dense subset, let 
F:A* — Re be defined by F(a*) = f(a). It is simple to show that F is 
well-defined, for if a and b are in the same equivalence class, then ~ aRb 
and ~ bRa, so f(a) = f(b). Now F satisfies Eq. (3.1) for (A*, R*) and so, 
by Theorem 3.4, (A*, R*) has a countable order-dense subset. Uniqueness 
of f is proved just as in the proof of Theorem 3.2. ∎ 


COROLLARY 2. Suppose (A, R) is a binary relation. Then there is a 
real-valued function f on A satisfying 


aRb ⇔ f(a) ≥ f(b) (3.3) 


if and only if (A, R) is a weak order and its reduction (A*, R*) has a 
countable order-dense subset. Moreover, if there is such an f, then the 
representation 𝔄 = (A, R) → 𝔅 = (Re, ≥) is regular and (𝔄, 𝔅, f) is an 
ordinal scale. 


We now present several examples to illustrate the Birkhoff—Milgram 
Theorem. Suppose first that (A, R) is the lexicographic ordering of the 
plane, defined in Eq. (3.6). Then (A, R) is strict simple, and (A*, R*) = 
(A, R). Since we already know that (A, R) is not homomorphic to 
(Re, >), Theorem 3.4 tells us that (A, R) can have no countable order- 
dense subset. 

To give a second example, let A = [0, 1] ∪ [2, 3] and let R be > on A. 
Then (A, R) is homomorphic to (Re, >), by the homomorphism f(a) = a. 
Now (A, R) is strict simple, so it follows by Theorem 3.4 that (A, R) has a 
countable order-dense subset. Such a subset is the set of all rationals in 
[0, 1] ∪ [2, 3]. 

To give a third example, let A = [0, π] ∪ [2π, 3π], and let R be > on A. 
Then (A, R) is strict simple, and it is homomorphic to (Re, >) by the map 
f(a) = a. However, the set B of rationals in [0, π] ∪ [2π, 3π] does not 
form a countable order-dense subset of A. For π and 2π are in A − B, 
2πRπ (and hence ~πR2π), but there is no element c of B such that 
2πRcRπ. To obtain a countable order-dense set, we must add π and 2π to 
B. 

To give a fourth example, let A = {0, 1} and let R be > on A. Then 
(A, R) is homomorphic to (Re, >) by the map f(a) = a. The set {0, 1} is 
a countable order-dense subset of A. 

To give one final example, suppose that A = Re × Re and that 
(a, b)R(s, t) holds if and only if a > s. Then (A, R) is strict weak and 
(A*, R*) has a countable order-dense subset, consisting of all equivalence 
classes that contain an element (s, t) with both s and t rational. It follows 
from Corollary 1 to Theorem 3.4 that (A, R) is homomorphic to (Re, >). 
The homomorphism is easy to describe explicitly; it is f[(a, b)] = a. 

We close this section by proving Theorem 3.4. The reader may skip the 
proof without loss of continuity.* 

To present the proof, we use the following notion: If (A, R) is an 
asymmetric binary relation and a, b ∈ A are such that aRb holds but 
aRcRb fails for all c, then (a, b) is called a gap, and the points a and b are 
called, respectively, the lower and upper end points of the gap. In our third 
example above, (2π, π) is a gap. Our preliminary result, which follows one 
of Krantz et al. [1971], is about the set G of all a which are end points of 
some gap. 
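
For a finite set, the gaps can be listed directly from the definition. The following Python sketch is illustrative only; the two sample sets are hypothetical apart from {0, 1}, which is the fourth example above.

```python
def gaps(A, R):
    """All pairs (a, b) with aRb such that aRcRb fails for every c in A."""
    return [(a, b) for a in A for b in A
            if R(a, b) and not any(R(a, c) and R(c, b) for c in A)]

gt = lambda a, b: a > b

print(gaps([0, 1], gt))        # [(1, 0)]
print(gaps([0, 1, 2, 5], gt))  # [(1, 0), (2, 1), (5, 2)]
```

In a finite strict simple order every pair of neighbors is a gap, so every element is an end point of a gap as soon as the set has two or more elements.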


LEMMA. Suppose (A, R) is an asymmetric, complete binary relation and G 
is the set of all a which are end points of some gap. Then G is countable if 
either (A, R) has a countable order-dense subset or there is a real-valued 
function f on A satisfying (3.1). 


Proof. Let G_1 be the set of lower end points of gaps and G_2 be the set of 
upper end points. Suppose first that there is a countable order-dense subset 
B. Whenever (a, b) is a gap, order-denseness of B implies that either a ∈ B 
or b ∈ B. Thus we may map G_1 − B into B in a one-to-one manner: if 
a ∉ B, map it into b. Similarly, we may map G_2 − B into B in a 
one-to-one manner. (Proof that the maps are one-to-one requires 
completeness of (A, R), and is left to the reader.) It follows that G_1 − B and 
G_2 − B are countable, since they are in a one-to-one correspondence with 
a subset of a countable set. Thus, G is countable, since 


G = (G_1 − B) ∪ (G_2 − B) ∪ (B ∩ G) 


and the union of three countable sets is countable. 

Next, suppose that f satisfies (3.1). If (a, b) is a gap, then there is a 
rational r such that f(a) > r > f(b). Thus, we may define a 
correspondence between G_1 and a subset of the set of rationals, and 
similarly for G_2. We shall show that the correspondences are one-to-one. 
Since the rationals are countable, so are G_1 and G_2, and hence so is G, 
which completes the proof. 

We show that the correspondence between G_1 and a subset of the 
rationals is one-to-one. The proof for G_2 is similar. Suppose (a, b) and 


*An alternative proof, which for the most part avoids the unpleasant notion of “gap” that 
arises in the following, is outlined in Exers. 19 through 21. This proof is due to David 
Radford (personal communication). 

(a’, b’) are two different gaps, and r and r’ are rationals so that 


f(a) > r > f(b), 
f(a’) > r’ > f(b’). 


It is easy to show that a ≠ a’ and b ≠ b’. Since (a, b) is a gap, ~aRa’ or 
~a’Rb. Hence, by completeness, a’Ra, b = a’, or bRa’. In the latter two 
cases, 


r > f(b) ≥ f(a’) > r’, 


so r ≠ r’. In the former case, since (a’, b’) is a gap and a’Ra, we have 
~aRb’. Thus, we have b’ = a or b’Ra. In either case, 


r’ > f(b’) ≥ f(a) > r, 


so r ≠ r’. ∎ 


Suppose (A, R) is strict simple. Suppose first that (A, R) has a countable 
order-dense subset B and let B’ = B U G. We construct a function f 
satisfying (3.1). By the lemma, B’ is countable. Thus we may list the 
elements of B’ as x_1, x_2, x_3, ... . If x and y are any elements of A, define 


r(x, y) = 1 if xRy, 
          0 otherwise. 


Analogously to Eq. (3.5), define* 


f(x) = Σ_{j=1}^{∞} (1/2^j) r(x, x_j). (3.7) 


Then, the sum in (3.7) converges, since 


Σ_{j=1}^{∞} 1/2^j 


does. Moreover, f satisfies Eq. (3.1). For, suppose that aRb. Then bRx_j 
implies aRx_j. Moreover, there is some j so that aRx_j but not bRx_j. This is 
immediate if b is in B’. If b is not in B’, it is not in G, so (a, b) is not a gap 
and there is c ∈ A with aRcRb. If c is in B’, then we may take x_j = c. If c 
is not in B’, then by order-denseness of B, there is d in B so that cRdRb. 
We may take x_j = d. In any case, the conclusion is that 


f(a) ≥ f(b) + 1/2^j > f(b), 


*Once again, this idea is due to David Radford (personal communication). 

so aRb implies f(a) > f(b). Conversely, if ~aRb, then as in the proof of 
Theorem 3.1, aRy implies bRy, so f(b) ≥ f(a), and ~[f(a) > f(b)]. 

To prove the converse, suppose that there is a function f on A satisfying 
Eq. (3.1). We show that there is a countable order-dense subset B of A.* 
Let J be the set of ordered pairs (r, r’) such that r and r’ are rational 
numbers and r’ > f(a) > r for some a ∈ A. For all (r, r’) ∈ J, let a_{rr’} be 
one such element a, and let B_1 = {a_{rr’}}.† The set B_1 is countable because 
the Cartesian product of the set of rationals with itself can be mapped onto 
B_1, and it is well-known that the Cartesian product of a countable set with 
itself is countable. Let G be the collection of end points of gaps in (A, R). 
By the lemma, G is countable, and hence so is B = G ∪ B_1. We show that 
B is order-dense. Suppose aRb holds. If (a, b) is a gap, then a, b ∈ B. If 
(a, b) is not a gap, then there is a c ∈ A such that aRcRb. Choose rationals 
r and r’ such that f(a) > r’ > f(c) > r > f(b). Then (r, r’) ∈ J, and a_{rr’} is 
such that f(a) > f(a_{rr’}) > f(b) and a_{rr’} is in B. 

The uniqueness statement of Theorem 3.4 is proved just as that of 
Theorem 3.2. This completes the proof. 


Exercises 


1. Let A = {0, 1, 2, 3, 4}, and let R be < on A. Show the following: 
(a) The function f of Eq. (3.2) is given by f(x) = 4 − x. 
(b) (A, R) is homomorphic to (Re, >) via f. 

2. A strict weak ordering R can be defined by ranking alternatives in 
vertical order, with aRb if and only if a is higher than b in the order. Show 
that the order is not necessarily strict simple, as long as two elements can 
be equally high in the list. 


3. (a) The following preference ranking of cities as places to visit 
defines a strict weak ordering as in Exer. 2: 


Rome-London-Copenhagen 
Paris 

Athens 

Vienna 

Moscow-Stockholm 

Brussels 


*Our proof again follows that of Krantz et al. [1971]. 

†The choice of one representative from each of an infinite class of sets involves the 
assumption known as the Axiom of Choice. We do not dwell on this point here, but refer the 
reader to works on set theory like that of Bernays [1968] for a discussion of the Axiom of 
Choice and its numerous equivalent versions. The referees have pointed out that the 
Birkhoff–Milgram Theorem can be proved without the use of the Axiom of Choice. 
Show that an ordinal utility function is given by 


f(Rome) = f(London) = f(Copenhagen) = 6, 
f(Paris) = 5, 
f(Athens) = 4, 
f(Vienna) = 3, 
f(Moscow) = f(Stockholm) = 1, 
f(Brussels) = 0. 


(b) Find ordinal utility functions for the following preference rank- 
ings: 
(i) Preference for cars 
Buick 
Cadillac-Volkswagen 
Datsun—Toyota 
Chevrolet 
(ii) Preference for foods 
Steak 
Lobster 
Roast beef—Chicken 
Sole-Flounder—Cod 
Hamburger 
4. (a) Show that an ordinal utility function for the preference data of 
Table 3.5 is given by 
f(tennis) = f(football) = 10, 
f(track) = 7, 
f(swimming) = f(basketball) = 1, 
f(baseball) = 0. 


(b) Determine whether or not an ordinal utility function exists for 
the data of Table 3.6. 

(c) Repeat for Table 3.7. 

(d) Consider the judgments of relative importance among objectives 
for a library system in Dallas shown in Table 1.8 of Chapter 1. If R is 


Table 3.5. Preferences among Sports Activities 
(Sport i is preferred to sport j iff the i,j entry is 1.) 


             Tennis  Baseball  Football  Basketball  Track  Swimming 

Tennis         0        1         0          1         1       1 
Baseball       0        0         0          0         0       0 
Football       0        1         0          1         1       1 
Basketball     0        1         0          0         0       0 
Track          0        1         0          1         0       1 
Swimming       0        1         0          0         0       0 
Table 3.6. Preferences among Cars 
(Car i is preferred to car j iff the i,j entry is 1.) 


Cadillac Buick Oldsmobile Toyota Volkswagen Chevrolet Datsun 


Cadillac 0 0 0 0 0 0 0 
Buick 1 0 1 1 1 1 
Oldsmobile 1 0 0 0 0 1 1 
Toyota 1 0 0 0 0 1 1 
Volkswagen 1 0 0 0 0 1 1 
Chevrolet 0 0 0 0 0 0 0 
Datsun 0 0 0 0 0 0 0 


Table 3.7. Preferences among Cars 
(Car i is preferred to car j iff the i,j entry is 1.) 


Cadillac Buick Oldsmobile Toyota Volkswagen Chevrolet Datsun 


Cadillac 0 1 0 0 0 0 0 
Buick 0 0 1 1 1 1 
Oldsmobile 1 0 0 0 0 1 1 
Toyota 1 0 0 0 0 1 1 
Volkswagen 1 0 0 0 0 1 1 
Chevrolet 0 0 0 0 0 0 0 
Datsun 0 0 0 0 0 0 0 


Table 3.8. Taste Preference for Vanilla Puddings* 
(Entry i,j is 1 iff pudding i is preferred to pudding j by the group of judges.) 


     1  2  3  4  5 

1    0  0  1  1  0 
2    1  0  0  1  0 
3    0  1  0  0  0 
4    0  0  1  0  0 
5    1  1  0  1  0 


*Data obtained from an experiment of 
Davidson and Bradley [1969]. 


“more important than,” is there a function f on {a, b, c, d, e, f} satisfying 
Eq. (3.1)? 

(e) Repeat part (d) for the judgments of relative importance among 
goals for a state environmental agency in Ohio shown in Table 1.9 of 
Chapter 1. 


5. (a) The data of Table 3.8 is obtained from an experiment of David- 
son and Bradley [1969] in which various brands of vanilla pudding were 
compared as to taste or flavor. It represents the consensus preferences of a 
group (the group is said to prefer i to j if a majority of its members do). 
Does an ordinal utility function exist? 

(b) Group preferences can be nontransitive even when each individual 
is transitive. To give an example, suppose (A, P_i) is the preference 
relation of the ith member of a group for elements in a set of alternatives 
A. Suppose we define group preference P on A by the simple majority rule: 
aPb if and only if for a majority of i, aP_i b. Show that even if all P_i are 
transitive, even if they are all strict simple orders, P does not have to be 
transitive. (This result is known as the voter’s paradox or Condorcet’s 
paradox, after the philosopher and social scientist Marie Jean Antoine 
Nicolas Caritat, Marquis de Condorcet, who discovered it in the eighteenth 
century. For a discussion, see, for example, Riker and Ordeshook [1973] or 
Roberts [1976].) 
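
The simple majority rule of part (b) is easy to simulate. The sketch below uses a hypothetical three-member group with three alternatives x, y, z (the rankings are chosen for illustration and are not data from the text); each ranking is a strict simple order, listed from most to least preferred.

```python
def majority_prefers(rankings, a, b):
    """True if a strict majority of the rankings place a above b."""
    votes = sum(1 for r in rankings if r.index(a) < r.index(b))
    return votes > len(rankings) / 2


def is_transitive(A, P):
    """Check that aPb and bPc imply aPc for all a, b, c in A."""
    return all(not (P(a, b) and P(b, c)) or P(a, c)
               for a in A for b in A for c in A)


A = ["x", "y", "z"]
# Hypothetical individual preferences, best alternative first.
rankings = [["x", "y", "z"],
            ["y", "z", "x"],
            ["z", "x", "y"]]


def P(a, b):
    # Group preference by simple majority; nothing is preferred to itself.
    return a != b and majority_prefers(rankings, a, b)


for a in A:
    for b in A:
        if P(a, b):
            print(a, "P", b)
print("transitive:", is_transitive(A, P))
```

Changing the three rankings shows how sensitive the group relation is to the individual orders.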


6. Suppose A is a set of sounds. In psychophysics, one often runs an 
experiment in which sounds a and b from A are presented and a subject is 
asked which is louder. Subjects can be inconsistent. Hence, pairs a, b from 
A are presented together a number of times, and the experimenter records 
p_{ab}, the proportion of times that a is judged louder than b. It is commonly 
assumed that a is judged definitely louder than b, denoted aRb, if and only 
if p_{ab} ≥ .75. Suppose A = {x, y, u, v, w} and 


                  x     y     u     v     w 

             x   .50   .81   .91   .81   .85 
             y   .19   .50   .61   .55   .78 
(p_{ab}) =   u   .09   .39   .50   .63   .82 
             v   .19   .45   .37   .50   .79 
             w   .15   .22   .18   .21   .50 


Find R and determine if there is a real-valued function f on A satisfying 
Eq. (3.1). (See Section 6.2 for a number of measurement representations 
arising from data p_{ab}.) 
7. (a) If A = {1,2,..., 10}, R=>, and B= {1,2,..., 10}, show 
that B is a countable order-dense subset of A. 
(b) Is it countable dense? 
8. If A = the irrationals and R = >, show from the Birkhoff—Milgram 
Theorem that (A, R) is homomorphic to (Re, >). 
9. If A = Re and R =<, show from the Birkhoff—Milgram Theorem 
that (A, R) is homomorphic to (Re, >). 
10. Show that the lexicographic ordering on Q X Q is homomorphic to 
(Re, >). 
11. Which of the following relational systems (A, R) are homomorphic 
to (Re, >)? 
(a) A = {0, 1} x {0, 1}, R = lexicographic ordering on A.* 


*The lexicographic ordering R on a Cartesian product A_1 × A_2 × ··· × A_n of sets of real 
numbers is defined as follows: 


(a_1, a_2, ..., a_n)R(b_1, b_2, ..., b_n) ⇔ (a_1 > b_1) or (a_1 = b_1 & a_2 > b_2) or ... 
or (a_1 = b_1 & a_2 = b_2 & ... & a_{n−1} = b_{n−1} & a_n > b_n). 


More generally, if R_i is a strict weak order on the set A_i, i = 1, 2, ..., n, and if, on A_i, aE_i b 
holds if and only if ~aR_i b and ~bR_i a, then we can define a lexicographic ordering R on 
A_1 × A_2 × ··· × A_n relative to R_1, R_2, ..., R_n as follows: 


(a_1, a_2, ..., a_n)R(b_1, b_2, ..., b_n) ⇔ (a_1 R_1 b_1) or (a_1 E_1 b_1 & a_2 R_2 b_2) or ... 
or (a_1 E_1 b_1 & a_2 E_2 b_2 & ... & a_{n−1} E_{n−1} b_{n−1} & a_n R_n b_n). 
(b) A = Re × {0, 1}, R = lexicographic ordering on A. 

(c) A = {0, 1} × {0, 1} × Re, R = lexicographic ordering on A. 
(d) A = {0, 1} × Q × Re, R = lexicographic ordering on A. 

(e) A = Re × {0, 1} × Re, R = lexicographic ordering on A. 


(f) A_1 = A_2 = Re, R_1 = >, R_2 = <, R = lexicographic ordering on 
A_1 × A_2. 
(g) A_1 = Re × Re, A_2 = {0, 1}, (a, b)R_1(s, t) ⇔ a > s, R_2 = >, R 
= lexicographic ordering on A_1 × A_2 relative to R_1, R_2. 
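
The general lexicographic ordering defined in the footnote to Exercise 11 can be written as a short function. This is only a sketch: each component relation is assumed to be supplied as a two-argument predicate representing a strict weak order, and the names are illustrative.

```python
import operator


def lex_order(relations):
    """Lexicographic ordering relative to strict weak orders R_1, ..., R_n.

    Each element of `relations` is a predicate R_i(x, y) on the i-th factor.
    """
    def R(a, b):
        for R_i, a_i, b_i in zip(relations, a, b):
            if R_i(a_i, b_i):
                return True        # first coordinate that is not tied decides
            if R_i(b_i, a_i):
                return False
            # here a_i E_i b_i (tied in R_i), so look at the next coordinate
        return False               # tied in every coordinate
    return R


# Part (a): {0, 1} x {0, 1} with the usual ">" in both coordinates.
R_a = lex_order([operator.gt, operator.gt])
print(R_a((1, 0), (0, 1)))

# Part (f): R_1 is ">" and R_2 is "<".
R_f = lex_order([operator.gt, operator.lt])
print(R_f((1, 5), (1, 7)))
```

Only finitely many pairs can be tested this way, so the sketch illustrates the definition rather than settling which systems are homomorphic to (Re, >).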
12. If (A, R) is homomorphic to (Re, >), does it necessarily follow that 
(A, R) has a countable order-dense subset? 


13. (Fishburn [1970, p. 27]) Let A = [−1, 1] and define R on A by 
aRb ⇔ [|a| > |b| or (|a| = |b| and a > b)]. 


Show that (A, R) has no countable order-dense subset. 


14. For each (A, R) of Exer. 11, identify the sets G, and G,, the lower 
and upper end points of gaps, respectively, and compute the cardinality of 
G, and G,. 

15. If f is an ordinal utility function on (A, R), which of the following 
assertions are meaningful? 

(a) f(a) > 2f(b). 
(b) f(a) ≠ f(b). 
(c) a has the largest utility of any element of A. 

16. Suppose f is an ordinal utility function on (A, R) and g: A → Re is a 
derived scale defined by the condition 


g(a) > g(b) ⇔ f(a) > f(b). 


What is the scale type of g in the narrow sense? 


17. Suppose R is a binary relation on a set A, S is a binary relation on 
Re, and f is a homomorphism from 𝔄 = (A, R) into 𝔅 = (Re, S) with 
|f(A)| > 1. 

(a) If (𝔄, 𝔅, f) is an ordinal scale, show that S is >, ≥, <, or ≤ 
and 𝔄 → 𝔅 is regular. 

(b) Show that (𝔄, 𝔅, f) could not be an interval scale. 

(c) Show that the conclusion of part (a) is false without the hypothesis 
that |f(A)| > 1. 


18. Prove that if (A, R) is an asymmetric, complete binary relation with 
a countable order-dense subset B and if G, is the set of lower end points of 
gaps, then we may map G, — B into B in a one-to-one manner. 


19. This exercise and the next two sketch an alternative proof of 
Theorem 3.4, which is due to David Radford (personal communication). 
Suppose (A, R) is an asymmetric binary relation. A subset O of A is called 
left-order-dense if, whenever aRb, then either b € O or there is c € O so 
that aRcRb. Show that if (A, R) is a strict simple order, then there is a 
countable left-order-dense subset of A if and only if there is a countable 
order-dense subset of A. (The proof uses the lemma on p. 114.) 
20. Suppose (A, R) is a strict simple order and there is a countable 
left-order-dense subset O of A. Suppose x_1, x_2, x_3, ... are the elements of 
O. Show that Eq. (3.7) defines a function f satisfying Eq. (3.1). 


21. Suppose (A, R) is a strict simple order and there is a function 
f: A → Re satisfying (3.1). For each positive integer n and each integer k, 
0 ≤ k ≤ 2^{2(n+1)} − 1, let 


J_{kn} = ( −2^n + k/2^{n+1}, −2^n + (k + 1)/2^{n+1} ]. 


If f(x) is in J_{kn} for some x in A, let x_{kn} be that x for which f(x) is 
maximum, if there is such an x, and be an arbitrary x in A for which f(x) 
is in J_{kn} if there is no x with f(x) maximum. Let 


O = {x_{kn}: f(x) ∈ J_{kn} for some x ∈ A}. 


Show that O is a countable left-order-dense set in (A, R). 


22. Suppose (A, R) is a preference relation. If there is some measure of 
closeness on A, then we might want to require that the utilities assigned to 
elements a and b in A be close whenever a and b are close. To make this 
idea precise, we introduce a topology on sets A with strict weak orders R 
on A. This topology, called the R-order topology or the interval topology, 
and denoted 𝒯(R), is defined to be the smallest system of subsets of A, 
closed under finite intersections and arbitrary unions, and containing all 
open rays, that is, subsets of the form 


{x ∈ A: xRa} or {x ∈ A: aRx} 


for fixed a’s in A. We search for ordinal utility functions continuous in the 
topology 𝒯(R). 

(a) Show that if there is an ordinal utility function on (A, R) 
continuous in a topology 𝒯 on A (and the usual topology on the reals), 
then 𝒯 must contain all open rays. (Fishburn [1970, p. 36] proves that if 𝒯 
contains all open rays and there is an ordinal utility function, then there is 
an ordinal utility function continuous in 𝒯. Hence, in particular, if there is 
an ordinal utility function, there is one that is continuous in the R-order 
topology. This result goes back to Debreu [1954].) 

(b) If (A, R) is a strict simple order, show that (A, R) has a 
countable, order-dense subset if and only if the R-order topology 𝒯(R) has 
a countable base. 

(c) Thus, show that a strict weak order (A, R) has a continuous 
ordinal utility function if and only if 𝒯(R*) has a countable base, where 
R* is the reduction of R. (This result is due to Debreu [1954].) 

(d) By considering the open sets 


{(b, c): (a, 1)R(b, c)R(a, 0)}, 


show that lexicographic preference on Re X Re could not have a count- 
able base in the order topology. For further results on the order topology, 
see Pfanzagl [1968, Chapter 3]. 
3.2 Extensive Measurement 


3.2.1 Hölder’s Theorem 


A second case of fundamental measurement which is of interest is that 
where the relational system is (A, R, ∘), where R is a binary relation on A 
and ∘ is a binary operation. One seeks conditions on (A, R, ∘) (necessary 
and) sufficient for the existence of a real-valued function f on A satisfying 
(3.1) and 


f(a ∘ b) = f(a) + f(b), (3.8) 


or conditions on (A, R, ∘) (necessary and) sufficient for the existence of a 
real-valued function f on A satisfying (3.3) and (3.8). These representations 
arose in our study of mass and of preference where we want a utility 
function to “preserve” combinations of objects. To avoid redundancy, we 
shall state results carefully only for the former representation. That is, we 
seek conditions on (A, R, ∘) (necessary and) sufficient for the existence of 
a homomorphic map into (Re, >, +). 

Attributes that have additive properties, such as mass for example, have 
traditionally been called extensive in the literature of measurement, and so 
the problem of finding conditions on (A, R, ∘) (necessary and) sufficient 
for the existence of a homomorphic map into (Re, >, +) is called the 
problem of extensive measurement. 

The theory of extensive measurement will naturally overlap with abstract 
algebra, which often deals with relational systems of the form (A, ∘). 
We review a few of the relevant concepts dealing with such relational 
systems.* If ∘ is an operation on the set A, the pair (A, ∘) is called a group 
if it satisfies the following axioms:† 


Axiom G1 (Associativity). For all a, b, c in A, (a ∘ b) ∘ c = a ∘ (b ∘ c). 


Axiom G2 (Identity). There is an (identity) element e in A such that for 
all a in A, a ∘ e = e ∘ a = a. 


Axiom G3 (Inverse). For all a in A, there is an (inverse) element b in A 
such that a ∘ b = b ∘ a = e. 


The reader is familiar with many examples of groups. (Re, +) is an 
example, with the identity e of Axiom G2 being 0 and the inverse b of 
Axiom G3 being −a. (Re⁺, ×) is a group, with the identity being 1 and 
the inverse of a being 1/a. (Re, ×) is not a group, since the only possible 
identity is 1, and the element 0 has no inverse b in Re such that 
0 × b = b × 0 = 1. 


*The reader familiar with group theory may wish to skip directly to the statement of 
Hölder’s Theorem below. 

†Often, an additional axiom called Closure is explicitly stated. This axiom asserts that for 
all a, b in A, a ∘ b is in A. This is implicit in the definition of an operation. 

If (A, ∘) is a group, we may define na for every positive integer n. The 
definition is inductive. We define 1a to be a. Having defined na, we define 
(n + 1)a to be a ∘ na. The definition makes sense whenever (A, ∘) is 
associative. 

We have seen in the previous section that obtaining axiomatizations for 
homomorphisms into the reals can involve translating properties of the real 
number system into an abstract relational system. An example was the 
translation of the existence of a countable order-dense subset, the rationals, 
into an axiom for the representation of (A, R) into (Re, >). Most 
axiomatizations in measurement theory try to capture an important property 
of the real numbers called the Archimedean property. This property can 
be defined as follows: If a and b are real numbers and a > 0, then there is 
a positive integer n such that na > b. That is, no matter how small a might 
be or how large b might be, if a is positive, then sufficiently many copies of 
a will turn out to be larger than b. This property of the real number system 
is what makes measurement possible; it makes it possible to roughly 
compare the relative magnitudes of any two quantities a and b, by seeing 
how many copies of a are required to obtain a larger number than b. We 
shall try to translate the Archimedean property of the reals into an 
Archimedean axiom in (A, R, ∘) in order to axiomatize the representation 
(A, R, ∘) into (Re, >, +). 
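
The Archimedean property of the reals can be made concrete by computing the number of copies of a needed to exceed b. The function below is an illustrative sketch (its name is not from the text); it uses exact rational arithmetic so that the boundary case na = b is handled correctly.

```python
from fractions import Fraction
import math


def copies_needed(a, b):
    """Smallest positive integer n with n*a > b, for real a > 0."""
    a, b = Fraction(a), Fraction(b)
    if a <= 0:
        raise ValueError("the Archimedean property is stated for a > 0")
    # n*a > b  <=>  n > b/a, so take the first integer above b/a (at least 1).
    return max(1, math.floor(b / a) + 1)


print(copies_needed(2, 6))                    # 3 copies give exactly 6, so one more is needed
print(copies_needed(Fraction(1, 1000), 5))    # a small a still overtakes b eventually
```

However small a is, the function returns a finite n, which is exactly what the property asserts.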

Sufficient conditions for extensive measurement, the representation 
(A, R, ∘) into (Re, >, +), were first given by Hölder [1901]. We state 
some conditions closely related to those originally given by Hölder, by 
giving the following definition. A relational system (A, R, ∘) is an Archimedean 
ordered group if it satisfies the following axioms: 


Axiom A1. (A, ∘) is a group. 

Axiom A2. (A, R) is a strict simple order. 


Axiom A3 (Monotonicity). For all a, b, c in A, 
aRb iff (a ∘ c)R(b ∘ c) iff (c ∘ a)R(c ∘ b). 


Axiom A4 (Archimedean). For all a, b in A, if aRe, where e is the identity 
for (A, ∘), then there is a positive integer n such that naRb. 


The paradigm example of an Archimedean ordered group is of course 
(Re, >, +). We know that (Re, +) is a group and (Re, >) is a strict 
simple order. Monotonicity follows, since 


a > b iff a + c > b + c iff c + a > c + b. 


The Archimedean axiom follows directly from the Archimedean property 
of the real numbers. (Re⁺, >, ×) is another example of an Archimedean 
ordered group. Hölder’s Theorem is the following: 


THEOREM 3.5 (Hölder).* Every Archimedean ordered group is homomorphic 
to (Re, >, +). 


Proof. Omitted. See Krantz et al. [1971] for a proof. 


It should be noted that every homomorphism of an Archimedean ordered 
group into (Re, >, +) is one-to-one, that is, is an isomorphism. This 
follows since (A, R) is strict simple. 

The axioms for an Archimedean ordered group give a representation 
theorem for extensive measurement. As such, they should be tested for 
various examples. Let us begin with the case of mass. Here, A is a 
collection of objects, aRb is interpreted to mean that a is judged heavier 
than b, and a ∘ b is the combined object. Let us consider first the group 
axioms. To say (A, ∘) is a group requires that ∘ be an operation, so in 
particular we have to make sense out of a ∘ a, a ∘ (a ∘ a), etc. This is the 
problem we discussed at the end of Section 1.7. Things work out in theory 
if we imagine an infinite number of ideal copies of each element in the 
group. However, it is still necessary to make sense out of complicated 
combinations like a ∘ [b ∘ (c ∘ d)]. Of the explicit group axioms, certainly 
associativity seems to make sense: taking first the combination of a and b 
and then combining with c amounts to the same object (as far as mass is 
concerned) as first combining b and c and then combining with a. Also, at 
least ideally, there is an element with no mass, which could serve as the 
identity e. Speaking of inverses, however, does not make sense. Given an 
object a, the axiom requires that there be another object b, which, when a 
and b are combined, results in the identity, the object with no mass. There 
is no such b. Thus, to obtain a usable representation theorem for measurement 
of mass, it is necessary to modify this axiom. 

Let us next consider the remaining axioms for an Archimedean ordered 
group. It might be reasonable to assume that (A, R) is a strict simple 
order, although we can run into problems with completeness: two different 


*It should be remarked that from Hölder’s Theorem it follows that (A, ∘) is commutative, 
that is, that a ∘ b = b ∘ a, all a, b in A. It is surprising that commutativity does not have to 
be assumed. (The second “iff” in Axiom A3 plays the role of a commutativity axiom, though 
this axiom may be weakened to read: if aRb, then (a ∘ c)R(b ∘ c) and (c ∘ a)R(c ∘ b).) The 
author knows of no simple direct proof of commutativity from the axioms for an Archimedean 
ordered group. 
objects may be close enough in mass so that we cannot distinguish them. 
Monotonicity probably is reasonable; if you think a is heavier than b, then 
adding the same object c to each of a and b should not change your 
opinion, and taking c away similarly should not. Moreover, the order in 
which you add c to a and to b should not matter. Finally, it is probably 
reasonable to assume the Archimedean axiom: Given an object a that is 
heavier than the object with no mass, and given another object b, if we 
combine enough copies of a, it seems reasonable that, at least in principle, 
we can create an object that is heavier than b. The Archimedean axiom is 
not really empirically testable, in the sense that any finite experiment could 
not get enough data to refute it.* However, one could get very strong 
evidence against the Archimedean axiom, which would amount to a 
refutation. Still, if we accept this axiom, we often treat it as an idealization 
which seems reasonable. 

Let us next consider the same axioms for the case of preference. Here, A 
is a set of objects or alternatives, ∘ is again combination, and aRb means 
a is strictly preferred to b. Associativity is probably reasonable if the 
combination does not involve physical interaction between elements. However, 
if such interaction is allowed, then combining a with b first and then 
bringing in c might create a different object from that obtained when b 
and c are combined first. To give an example, if a is a flame, b is some 
cloth, and c is a fire retardant, then combining a and b first and then 
combining with c is quite different from combining b and c first and then 
combining with a. However, this result follows only if we allow interaction 
(for example, flame lights cloth) with our combination. Similar examples 
may be thought of from chemistry. It should be noted that in our 
discussion of associativity for the case of mass, we also tacitly assumed 
that combination did not involve physical or chemical interaction between 
alternatives being combined. The axioms of identity and inverse seem to 
be satisfied. For there is at least ideally an object with absolutely no worth 
at all. And given any object a, owing someone else such an object might be 
considered an inverse alternative. For having a and owing a amount to 
having nothing. (If future value is discounted over present value, we would 
have to look for an inverse to having a among the alternatives of owing 
objects of more worth than a.) The strict simple order axiom is one we 
have previously questioned for preference; we have even questioned 
whether preference is a strict weak order, and we probably can question 
completeness: it is possible to be indifferent between two alternatives. Let 
us turn next to monotonicity. Even this axiom might be questioned, if 
objects combined can be made more useful than individual objects. For 


*The axiom could be refuted indirectly. In the presence of the other axioms for an 
Archimedean ordered group, the Archimedean axiom implies commutativity (see footnote on 
page 124). Hence, a violation of commutativity would provide evidence against the Archi- 
medean axiom, given one believes the other axioms. 


126 Three Representation Problems: Ordinal, Extensive, and Difference Measurement 3.2 


example, suppose a is black coffee, b is a candy bar, and c is sugar. You 
might prefer b to a (not liking black coffee), but prefer a ∘ c to b ∘ c. This 
is really an argument against the additive representation, not just the 
monotonicity axiom. Finally, perhaps one can even question the Archi- 
medean axiom. For example, suppose a is a lamp and b is a long, healthy 
life. Will sufficiently many lamps ever be better than having a long healthy 
life? 

The discussion above suggests that it is necessary to modify Hölder’s 
axioms to obtain a satisfactory representation theorem for extensive 
measurement, even if it is only mass we wish to measure. Some early 
attempts to improve Hölder’s Theorem can be found in Huntington 
[1902a,b, 1917], Suppes [1951], Behrend [1953, 1956], and Hoffman [1963]. 
All these improvements involve some axioms that are not necessary, as 
indeed does Hölder’s Theorem. We shall present a set of axioms that are 
necessary as well as sufficient. 


3.2.2 Necessary and Sufficient Conditions for Extensive 
Measurement 


To find a set of axioms that are necessary as well as sufficient for 
extensive measurement, let us see which of Hölder’s axioms are necessary. 
Let us suppose that f is a homomorphism from (A, R, ∘) into (Re, >, +), 
and let us begin by looking at the group axioms. Certainly 


[f(a) + f(b)] + f(c) = f(a) + [f(b) + f(c)]. 

However, the representation does not imply that 


(a ∘ b) ∘ c = a ∘ (b ∘ c), 

but only that 


[(a ∘ b) ∘ c]E[a ∘ (b ∘ c)], 


where E is defined by 

xEy ⇔ ~xRy & ~yRx. (3.9) 


Second, the representation does not imply that there is an identity element 
e, since there may not be any element e in A such that f(e) = 0. Third, the 
same is true of the inverse. Thus, of the group axioms, only a weak version 
of associativity is necessary.* Next, turning to the condition that (A, R) be 


*Implicit in the group axioms is that ∘ is an operation, that is, that a ∘ b is defined for all 
a, b in A. This is not a necessary condition for the representation, and it might make sense to 
weaken it. In the case of mass, for example, we might limit combination only to objects whose 
combination fits into our lab! The approach to extensive measurement where combination is 
restricted can be found in Luce and Marley [1969] and Krantz et al. [1971, Section 3.4]. 




strict simple, we already know from the theorems of Section 3.1 that, since 
aRb iff f(a) > f(b), (A, R) must be strict weak. It does not necessarily 
follow that (A, R) is strict simple. The next axiom, monotonicity, is 
necessary. For, 


aRb ⇔ f(a) > f(b) ⇔ f(a) + f(c) > f(b) + f(c) ⇔ (a ∘ c)R(b ∘ c). 


Similarly, aRb ⇔ (c ∘ a)R(c ∘ b). Finally, the Archimedean axiom as 
stated cannot be necessary, for there is not necessarily an identity e in A. 
However, the Archimedean axiom can be restated as a necessary axiom. 
The assumption aRe amounted to saying that f(a) was positive. This is the 
same as saying that 2f(a) > f(a), which, since f is additive and preserves 
R, is the same as saying that 2aRa. Thus, the Archimedean axiom can be 
restated as follows: if a and b are in A and 2aRa, then there is a positive 
integer n such that naRb. (The notation na makes sense so long as the 
weak form of associativity holds.) In this form, the Archimedean axiom is 
necessary. For 2aRa implies 2f(a) > f(a), or f(a) > 0. Thus, by the 
Archimedean property for the reals, there is a positive integer n such that 
nf(a) > f(b). Since f is additive, f(na) > f(b), and naRb. Summarizing, the 
following conditions are necessary for extensive measurement: 


Axiom E1 (Weak Associativity). For all a, b, c in A, 
[a ∘ (b ∘ c)]E[(a ∘ b) ∘ c]. 


Axiom E2. (A, R) is a strict weak order. 


Axiom E3 (Monotonicity). For all a, b, c in A, 
aRb ⇔ (a ∘ c)R(b ∘ c) ⇔ (c ∘ a)R(c ∘ b). 


Axiom E4 (Archimedean). For all a, b in A, if 2aRa, then there is a 
positive integer n such that naRb. 


Unfortunately, Axioms E1 through E4 together are not sufficient for 
extensive measurement. Proof is left to the reader. (Hint: Let A be the 
negative reals supplemented with an element −∞, let R be >, and define 
∘ to be + except that (−∞) ∘ x and x ∘ (−∞) are always −∞.) 
However, one obtains necessary and sufficient conditions by substituting 
for Axiom E4 a stronger Archimedean axiom: 


Axiom E4’ (Archimedean). For all a, b, c, d in A, if aRb, then there is a 
positive integer n such that (na ∘ c)R(nb ∘ d). 


To see that Axiom E4’ is indeed an Archimedean axiom (that is, it reflects 
the Archimedean properties of the reals) and that it is necessary, let us 
observe that aRb implies f(a) > f(b), so f(a) − f(b) > 0. Thus, by the 
Archimedean property for the reals, there is a positive integer n such that 
n[f(a) − f(b)] > f(d) − f(c), that is, nf(a) + f(c) > nf(b) + f(d). From 
this, since f is additive, it follows that f(na ∘ c) > f(nb ∘ d), or 
(na ∘ c)R(nb ∘ d). A system (A, R, ∘) satisfying Axioms E1, E2, E3, and 
E4’ is called an extensive structure. 
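
The strengthened Archimedean axiom can be explored in a toy model of mass. In the sketch below, objects are bundles (tuples of item names), the operation ∘ is putting two bundles together, and R is "heavier than" as computed from a table of masses. The item names, masses, and function names are all hypothetical, chosen only for illustration.

```python
from fractions import Fraction

# Hypothetical masses for a few ideal objects (exact rationals avoid rounding).
MASS = {"brick": Fraction(5, 2), "stone": Fraction(2),
        "feather": Fraction(1, 100), "anvil": Fraction(50)}


def combine(x, y):
    """The operation on bundles: put the two bundles together."""
    return x + y


def f(bundle):
    """Additive representation: total mass of the bundle."""
    return sum((MASS[item] for item in bundle), Fraction(0))


def R(x, y):
    """x is heavier than y."""
    return f(x) > f(y)


def n_copies(a, n):
    """na, defined inductively by 1a = a and (k + 1)a = a combined with ka."""
    result = a
    for _ in range(n - 1):
        result = combine(a, result)
    return result


def archimedean_witness(a, b, c, d, limit=10_000):
    """Smallest n <= limit with (na o c) R (nb o d), given aRb; None if not found."""
    if not R(a, b):
        raise ValueError("the axiom only applies when aRb")
    for n in range(1, limit + 1):
        if R(combine(n_copies(a, n), c), combine(n_copies(b, n), d)):
            return n
    return None


a, b, c, d = ("brick",), ("stone",), ("feather",), ("anvil",)
print(archimedean_witness(a, b, c, d))
```

Here the small advantage of a over b, repeated often enough, outweighs the large advantage of d over c, which is the content of the axiom.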


THEOREM 3.6 (Roberts and Luce [1968]). Suppose A is a set, R is a 
binary relation on A and ∘ is a binary operation on A. Then there is a 
real-valued function f on A satisfying 


aRb ⇔ f(a) > f(b) (3.1) 

and 


f(a ∘ b) = f(a) + f(b) (3.8) 


if and only if (A, R, ∘) is an extensive structure. 


This theorem was stated as a corollary of a more general result in 
Roberts and Luce [1968]. We have already proved that (A, R, 0) is an 
extensive structure whenever a representation holds. A direct proof of the 
sufficiency of the conditions can be found in Krantz et al. [1971, Theorem 
3.1]. The proof reduces this situation to Hölder’s Theorem. We shall omit 
it. Other theorems giving necessary and sufficient conditions for extensive 
measurement can be found in Alimov [1950] and Holman [1969]. (See 
Exer. 12 below.) 

Let us again consider the axioms for an extensive structure as axioms for 
measurement of mass and of preference. If R is the binary relation 
“heavier than,” then the axioms for an extensive structure are probably 
satisfied, at least ideally. (The only axiom for an Archimedean ordered 
group with which we found serious fault for the case of mass was the 
existence of an inverse.) Suppose next that R is preference. Our discussion 
of the fire retardant casts doubts about weak associativity just as it did 
about associativity, if the combination allows interaction. The discussion 
in Section 3.1 casts doubts about (A, R) being strict weak. The example of 
the coffee, sugar, and candy bar casts doubts about monotonicity. To 
consider the Archimedean axiom, suppose a is one dollar, b is no dollars, c 
is life as a cripple, and d is a long and healthy life. It is conceivable that no 
amount of money will be enough to compensate for life as a cripple, and 
so there might be no n such that (na ∘ c)R(nb ∘ d). 

Extensive measurement is basic to the physical sciences. However, as 
Krantz et al. [1971, pp. 123, 124] point out, the attempt to apply extensive 
measurement to the social sciences usually meets with some sort of 
difficulty. Even if there is an operation available, the axioms for extensive 
measurement are usually not satisfied. The exceptions mentioned by 
Krantz et al. are risk and subjective probability. We discuss subjective 
probability in Chapter 8. Risk is discussed in a variety of places in the 
literature. One theory of risk, that of Pollatsek and Tversky [1970], can be 
formulated in terms of extensive measurement. The basic idea of the 
Pollatsek–Tversky theory is that one compares various probability distributions 
as to riskiness. Comparative riskiness defines the binary relation of 
the theory of extensive measurement. The operation ∘ corresponds to 
convolution; that is, 


(f ∘ g)(t) = ∫_{−∞}^{∞} f(t − x) g(x) dx. 


The Pollatsek-Tversky theory of risk has not been empirically verified. 
The representation problem for extensive measurement has recently 
been treated from a probabilistic point of view. See Falmagne [1978]. 


3.2.3. Uniqueness 


Before leaving the topic of extensive measurement, we ask for a unique- 
ness statement. Our earlier observations about measurement of mass 
suggest that the representation f should be unique up to a similarity 
transformation; that is, measurement should be on a ratio scale. We shall 
prove this. This result will have significance for preference as well. For it 
says that if a utility function f satisfies conditions (3.1) and (3.8), then it is 
meaningful to say that the utility of one alternative is twice the utility of a 
second, or half the utility of a second, and so on. If this is the case, we can 
begin to use utility functions to make “quantitative” decisions, rather than 
just “qualitative” ones. 


THEOREM 3.7. Suppose A is a non-empty set, R is a binary relation on A, 
∘ is a (binary) operation on A, and f is a real-valued function on A 
satisfying 


aRb ⇔ f(a) > f(b) (3.1) 

and 

f(a ∘ b) = f(a) + f(b). (3.8) 


Then 𝔄 = (A, R, ∘) → 𝔅 = (Re, >, +) is a regular representation, and 
(𝔄, 𝔅, f) is a ratio scale. 


Proof. That 𝔄 → 𝔅 is a regular representation follows from Theorem 2.1. 

To show that (𝔄, 𝔅, f) is a ratio scale, suppose first that for some α > 0, 
φ(x) = αx for all x in f(A). Clearly φ ∘ f satisfies (3.1) and (3.8), and so φ 
is an admissible transformation. 
Conversely, suppose φ: f(A) → Re is an admissible transformation. We 
shall show that for some α > 0, φ(x) = αx, all x ∈ f(A). Let g = φ ∘ f. 
We show first that f(a) > 0 iff g(a) > 0. For since f and g satisfy (3.1) and 
(3.8), 


f(a) > 0 iff f(a) + f(a) > f(a), 
         iff f(a ∘ a) > f(a), 
         iff (a ∘ a)Ra, 
         iff g(a ∘ a) > g(a), 
         iff g(a) + g(a) > g(a), 
         iff g(a) > 0. 


A similar proof shows that f(a) < 0 iff g(a) < 0. 

If f(a) = 0 for all a in A, any positive α will suffice to satisfy φ(x) = αx
for all x in f(A). Thus, we may assume that for some e in A, f(e) ≠ 0. We
shall assume that f(e) > 0. The proof in the case that f(e) < 0 is similar.
Since f(e) > 0, g(e) must be > 0.

Pick α such that g(e) = αf(e). Since f(e) and g(e) are positive, so is α.
We shall show that g(a) = αf(a), for all a in A. This proves that φ(x) = αx
for all x in f(A). The proof that g(a) = αf(a) proceeds by contradiction,
assuming first that g(a) < αf(a) and second that g(a) > αf(a). The proof
is similar in the second case, and so is left to the reader.

If g(a) < αf(a), then


g(a)/g(e) = g(a)/[αf(e)] < αf(a)/[αf(e)] = f(a)/f(e).


Since between any pair of real numbers there is a rational, it follows that
there is a pair of positive integers m and n such that


g(a)/g(e) < m/n < f(a)/f(e). (3.10)


The second inequality of (3.10) implies that mf(e) < nf(a), so f(me) <
f(na), so naRme. But then g(na) > g(me), so ng(a) > mg(e), so


g(a)/g(e) > m/n.


This contradicts the first inequality of (3.10) and shows that g(a) < αf(a)
is impossible. ∎
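
As an informal numerical check of Theorem 3.7, the following Python sketch uses a hypothetical finite fragment: objects are labelled by positive integers and concatenation is modelled as integer addition. None of these names come from the text. Under those assumptions, a similarity transformation φ(x) = αx keeps both (3.1) and (3.8), while a shift φ(x) = x + 1 breaks (3.8).

```python
from itertools import product

# Hypothetical objects: positive integers standing in for "masses".
A = range(1, 7)

def R(a, b):
    # Assumed empirical relation: a is judged heavier than b.
    return a > b

def concat(a, b):
    # Assumed concatenation operation a o b.
    return a + b

def satisfies_3_1_and_3_8(h):
    """Check (3.1) aRb <=> h(a) > h(b) and (3.8) h(a o b) = h(a) + h(b) on A."""
    order_ok = all(R(a, b) == (h(a) > h(b)) for a, b in product(A, A))
    additive_ok = all(
        abs(h(concat(a, b)) - (h(a) + h(b))) < 1e-9 for a, b in product(A, A)
    )
    return order_ok and additive_ok

def f(a):
    # One representation (the factor 2.5 is an arbitrary choice of unit).
    return 2.5 * a

def g_similar(a):
    # phi(x) = 4x applied to f: a similarity transformation.
    return 4.0 * f(a)

def g_shift(a):
    # phi(x) = x + 1 applied to f: not a similarity transformation.
    return f(a) + 1.0

print(satisfies_3_1_and_3_8(f))
print(satisfies_3_1_and_3_8(g_similar))
print(satisfies_3_1_and_3_8(g_shift))
```

Because only similarity transformations survive, a ratio such as f(4)/f(2) has the same value under every admissible rescaling, which is what makes "twice the utility" a meaningful statement here.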
3.2.4 Additivity


Before closing this section, we make two remarks about operations in 
general and additivity in particular. First, in spite of historical emphasis on 
additivity, there is nothing magic about the addition operation. Indeed, if f 
satisfies (3.1) and (3.8), then g = e^f satisfies (3.1) and


g(a ∘ b) = g(a)g(b). (3.11)


Thus a multiplicative representation (3.1), (3.11) can also be obtained, with 
positive g. Conversely, if a multiplicative representation (3.1), (3.11) can be 
obtained with positive g, then f = ln g gives an additive representation
(3.1), (3.8). Moreover, it is easy to see that the logarithm of the
multiplicative representation gives rise to the same type of scale as the additive
representation, so the same comparisons using ln g are meaningful as
would be using f. The main point is that we make use of a representation 
to learn about empirical relations. We get just as much information out of 
the representation (3.1), (3.11) as we do out of the representation (3.1), 
(3.8). 
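
The passage between the additive and multiplicative forms can be checked numerically. The sketch below is illustrative only: the values assigned to f(a) and f(b) are hypothetical, and f(a ∘ b) is simply taken to be their sum, as (3.8) requires.

```python
import math

# Hypothetical additive values satisfying (3.8): f(a o b) = f(a) + f(b).
f_a, f_b = 1.2, 0.7
f_ab = f_a + f_b

# Multiplicative representation g = e^f.
g_a, g_b, g_ab = math.exp(f_a), math.exp(f_b), math.exp(f_ab)

multiplicative = math.isclose(g_ab, g_a * g_b)      # (3.11) holds for g
order_preserved = (f_a > f_b) == (g_a > g_b)        # (3.1) is unaffected
recovered = math.isclose(math.log(g_ab), f_ab)      # f = ln g gets (3.8) back

print(multiplicative, order_preserved, recovered)
```

Since exp is strictly increasing, the ordering in (3.1) carries over unchanged; only the arithmetic form of the second condition differs.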

As a second point, let us note that, as we suggested above, there are very 
few genuine empirical operations in the social sciences. Perhaps the most 
interesting one is the bisection operation, which arises when a subject is 
asked to produce a stimulus which he thinks is halfway between two given 
stimuli, for example, with respect to loudness or with respect to brightness. 
Suppose a ∘ b is defined to be the unique element that “bisects” a and b.
Suppose R is a binary relation on the set of objects being compared, with 
aRb interpreted to mean that a is judged higher (louder, brighter, etc.) than 
b. Then we seek a real-valued function f on A such that for all a, b in A, 


aRb ⇔ f(a) > f(b) (3.1)


and
f(a ∘ b) = [f(a) + f(b)]/2. (3.12)


We shall study this representation in Exer. 15. 

Exercises

1. (a) Suppose A = {0, 1}, R = >, and ∘ is defined by
0 ∘ 0 = 0, 0 ∘ 1 = 1 ∘ 0 = 1 ∘ 1 = 1.
(The operation ∘ is Boolean addition.) Show the following:


(i) (A, ∘) is associative.
(ii) (A, ∘) has an identity.
(iii) Not every element of A has an inverse.
(iv) (A, R) is a strict simple order.
(v) (A, R, ∘) violates monotonicity.
(vi) (A, R, ∘) violates the Archimedean axiom, Axiom A4.
(vii) (A, R, ∘) violates Axiom E4′.
(viii) (A, R, ∘) satisfies Axiom E4.

(b) Make a similar analysis if ∘ is modified as follows:

0 ∘ 0 = 0 ∘ 1 = 1 ∘ 0 = 0, 1 ∘ 1 = 1.

2. Show that if (A, R, ∘) is homomorphic to (Re, ≥, +), then
(a ∘ b)R(b ∘ a) for all a, b in A.

3. (a) Neither of the following relational systems is an Archimedean 
ordered group. Which of the axioms for an Archimedean ordered group 
holds in each case? 

(i) (Re⁺, <, +).
(ii) ({0, 1, 2}, >, + (mod 3)).

(b) Which of the axioms for an Archimedean ordered group holds
in each of the following relational systems? (Q⁺ is the positive rationals,
Re⁻ the negative reals.)

(i) (Re⁺, <, ×).
(ii) (N, >, +).
(iii) (Q, >, ×).
(iv) (Q⁺, >, ×).
(v) (Re, <, +). 

4. Suppose A = Re × Re, R is the lexicographic ordering of the plane,
and (a, b) ∘ (c, d) is defined to be (a + c, b + d).

(a) Show that it follows from Hölder’s Theorem that (A, R, ∘) is not
an Archimedean ordered group.

(b) Thus, one of the axioms for an Archimedean ordered group 
must be violated. Which one? 

5. (a) Show that (Re⁺, <, +) is an extensive structure.

(b) Which of the axioms for an extensive structure hold for each 
relational system in Exer. 3b? 

6. (Krantz et al. [1971, p. 77]) Which of the axioms for an extensive
structure are satisfied by the following relational systems (A, R, ∘)?

(a) A = N, aRb iff a + 1 > b, a ∘ b = a + b + 2.

(b) A = Re⁺, aRb iff a > b, a ∘ b = max{a, b} + ½ min{a, b}.

(c) A = {x} ∪ N, aRb iff [(a > b and a, b ∈ N) or (a = x and
b ∈ N)],

a ∘ b = a + b if a, b ∈ N,
[second case illegible in source]

(d) (A, R, ∘) as in Exer. 4.


7. Show that Axioms E1 through E4 together are not sufficient for
extensive measurement. 
8. Suppose f satisfies extensive measurement. Show that the following
statements are meaningful:
(a) f(a) > 7.6f(b).
(b) f(a) − f(b) > f(c) − f(d).
(c) f(a)/f(b) = 10.


9. Suppose f: A → Re⁺ and g: A → Re⁺ satisfy (3.1) and (3.11).
Show that


f(a) > 1 iff g(a) > 1.


10. Give examples to show that none of the group axioms are necessary
for extensive measurement. Specifically, give examples of relational
systems (A, R, ∘) homomorphic to (Re, >, +) but which violate

(a) associativity (Axiom G1); 
(b) identity (Axiom G2); 
(c) inverse (Axiom G3). 

11. Suppose (A, R) is a strict weak order and (A, ∘) is associative.
Suppose that every element of A satisfies 2aRa (such an element is called
positive). Show that 4aRa and that A must be infinite.


12. Suppose (A, R, ∘) satisfies the first three axioms for an extensive
structure. A pair of elements a and b in A is called anomalous if either aRb
or bRa and either for all positive integers n,


naR(n + 1)b and nbR(n + 1)a
or for all positive integers n,
(n + 1)bRna and (n + 1)aRnb.


Show that if (A, R, ∘) is homomorphic to (Re, >, +), then there is no
anomalous pair. (Alimov [1950] proved that Axioms E1, E2, and E3 of an
extensive structure plus the assumption that there are no anomalous pairs
provide necessary and sufficient conditions for extensive measurement.)

13. (Roberts and Luce [1968]) (A, R, ∘) satisfies weak solvability if, for
all a, b ∈ A,


~aRb ⇒ (∃c ∈ A)[bR(a ∘ c)].


(A, R, ∘) satisfies positivity if every a in A is positive, that is, it satisfies
2aRa. Show that if (A, R, ∘) satisfies Axioms E1, E2, and E3 of an
extensive structure and positivity, then weak solvability and the standard
Archimedean axiom (Axiom E4) imply the Archimedean Axiom E4′ of an
extensive structure.

14. Suppose (A, R, ∘) is homomorphic to (Re⁺, >, +), via a
homomorphism g. Fix e in A and define


S(a, e) = {m/n: m, n ∈ N and ~meRna},
where N is the set of positive integers. Show the following:

(a) S(a, e) is nonempty. 

(b) S(a, e) is bounded above. 

(c) If f(a) is the least upper bound of the set S(a, e), then f is a
homomorphism from (A, R, ∘) into (Re⁺, >, +). (To show this, consider
the relation between f and g.)

Note: Under assumptions similar to Hölder’s, Suppes [1951] and
Suppes and Zinnes [1963] use f to give a constructive proof of the existence
of an extensive measure. A key additional assumption needed for the proof
is that (A, R, ∘) satisfies positivity (Exer. 13).

15. In this exercise we study the binary operation of bisection. We
assume that the structure (A, R, ∘) is given, with R a binary relation on A
and ∘ a binary operation (bisection), and study the representation (3.1),
(3.12). In point of fact, a ∘ b might not be the same as b ∘ a, or even
judged equal in strength (loudness, brightness, etc.), as has been observed
by such writers as Stevens [1957]. Such order biases are called hysteresis.
As a result of these biases, the representation (3.12) might not be
appropriate, and we replace it with the representation

f(a ∘ b) = δf(a) + (1 − δ)f(b), (3.13)

where δ ∈ (0, 1). If δ ≠ ½, there is an order bias. The representation (3.1),
(3.13) has been studied by Pfanzagl [1959a, b]. The specific representation
(3.1), (3.12) is studied by Krantz et al. [1971, Section 6.9.2], and a more
general representation than (3.1), (3.13) is studied by Krantz et al. [1971,
Section 6.9.1]. Pfanzagl’s representation theorem involves four axioms. The
following are three of Pfanzagl’s axioms for the representation (3.1), (3.13).
(The fourth axiom is a topological axiom, making precise the statement
that a ∘ b is continuous in both of its arguments.)

(a) Reflexivity: a ∘ a = a.

(b) Monotonicity: aRb ⇒ (a ∘ c)R(b ∘ c).

(c) Bisymmetry: (a ∘ b) ∘ (c ∘ d) = (a ∘ c) ∘ (b ∘ d).

For each axiom, determine if it is a necessary condition for the representa- 
tion and, if not, modify the axiom so that it is necessary. 

Note: For an alternative approach to bisection, see Exer. 12, Section 3.3. 

16. (a) Show that if (A, R, ∘) is an extensive structure, then Eq. (1.24) of
Chapter 1 is satisfied and hence (A, R, ∘) is shrinkable and the reduction
(A*, R*, ∘*) of Section 1.8 is well-defined.

(b) Construct an extensive structure (A, R, ∘) which is not
isomorphic to its reduction.


3.3 Difference Measurement 
3.3.1 Algebraic Difference Structures 
Let A be a set and let D be a quaternary relation on A. In discussing
temperature differences in Section 3.1, we interpreted D(a, b, s, t) or,
equivalently, abDst, to be the statement that the difference between the
temperature of a and the temperature of b is judged to be greater than that
between the temperature of s and the temperature of t. We then
encountered the representation


abDst ⇔ f(a) − f(b) > f(s) − f(t). (3.4)


This is called (algebraic) difference measurement. A similar representation
makes sense in the case of preference, if abDst is interpreted to mean that I
like a over b more than I like s over t. In this case, f is a different kind of
utility function. In the present section, we shall state two representation 
theorems for the representation (3.4), and a uniqueness theorem. The 
uniqueness theorem is the focal point here, and the reader may wish to 
concentrate primarily on this theorem. 

The relation D may be obtained by a variant of a pair comparison 
experiment, using pairs of pairs of alternatives. Alternatively, D may be 
obtained by asking a subject to make numerical estimates of absolute 
differences. For example, Table 3.9 shows such numerical estimates
Δ(x, y). Suppose we define δ(x, y) as follows:


δ(x, y) = 0 if x = y or x and y are judged equally warm,
δ(x, y) = Δ(x, y) if x is judged warmer than y,
δ(x, y) = −Δ(x, y) if y is judged warmer than x.


Then D can be defined by
abDst ⇔ δ(a, b) > δ(s, t). (3.14)


We now ask whether or not the numbers δ(x, y) or Δ(x, y) could have
arisen as differences of temperatures, that is, whether or not there is a
function f so that


δ(a, b) > δ(s, t) ⇔ f(a) − f(b) > f(s) − f(t). (3.15)


This is the same as asking if there is a function f satisfying Eq. (3.4). 


Table 3.9. Judgments of Absolute Temperature Difference


Objects Compared, x, y    Warmer Object    Estimated Absolute Temperature Difference Δ(x, y)
a, b                      a                4
a, c                      a                2
a, d                      a                12
b, c                      b                4
b, d                      b                8
c, d                      c                4

To state a representation theorem for the representation (3.4), we
introduce five defining axioms for an algebraic difference structure (A, D).
We shall use the notation


abEst ⇔ [~abDst & ~stDab] (3.16)


and


abWst ⇔ [abDst or abEst]. (3.17)


The first three axioms are the following:
AXIOM D1. Suppose R is defined on A × A by
(a, b)R(s, t) ⇔ abDst.
Then (A × A, R) is strict weak.
AXIOM D2. For all a, b, s, t ∈ A, if abDst, then tsDba.


AXIOM D3. For all a, b, c, a′, b′, c′ ∈ A, if abWa′b′ and bcWb′c′, then
acWa′c′.


Axioms D1, D2, and D3 are clearly necessary conditions for the
representation (3.4). To see that Axiom D1 is necessary, define g: A × A → Re
by


g(a, b) = f(a) − f(b).
Then


(a, b)R(s, t) ⇔ g(a, b) > g(s, t).


Proof of the necessity of Axioms D2 and D3 is left to the reader. Axiom
D3 is sometimes called Weak Monotonicity. It is violated by the data of
Table 3.9, in the sense that if D is defined from this data by Eq. (3.14),
then (A, D) violates Axiom D3. This is the case because abWbc and bcWcc
but not acWbc. [It is clear that for this data, there can be no function
satisfying Eq. (3.15), because we have Δ(a, b) > 0, Δ(b, c) > 0, but Δ(a, c)
< Δ(b, c).]
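
The violation of Axiom D3 by the data of Table 3.9 can be confirmed by brute force. The following Python sketch is an illustration, not part of the original development: the dictionary `table` transcribes Table 3.9 (in every row the first object is the warmer one), and the helper names `delta`, `D`, and `W` are chosen here to mirror δ, D, and W.

```python
from itertools import product

objects = "abcd"

# Table 3.9: (x, y) -> estimated absolute difference; x is the warmer object.
table = {
    ("a", "b"): 4, ("a", "c"): 2, ("a", "d"): 12,
    ("b", "c"): 4, ("b", "d"): 8, ("c", "d"): 4,
}

def delta(x, y):
    """Signed difference: positive when x is judged warmer than y."""
    if x == y:
        return 0
    if (x, y) in table:
        return table[(x, y)]
    return -table[(y, x)]

def D(a, b, s, t):
    # Eq. (3.14): abDst iff delta(a, b) > delta(s, t).
    return delta(a, b) > delta(s, t)

def W(a, b, s, t):
    # abWst iff abDst or abEst; for this D that is the same as "not stDab".
    return not D(s, t, a, b)

# Axiom D3: abWa'b' and bcWb'c' should imply acWa'c'.
violations = [
    (a, b, c, a2, b2, c2)
    for a, b, c, a2, b2, c2 in product(objects, repeat=6)
    if W(a, b, a2, b2) and W(b, c, b2, c2) and not W(a, c, a2, c2)
]

print(("a", "b", "c", "b", "c", "c") in violations)
```

The tuple (a, b, c, b, c, c) corresponds to the instance abWbc, bcWcc, but not acWbc.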
The fourth axiom reads as follows: 


AXIOM D4. For all a, b, s, t ∈ A, if abWst holds and stWxx holds, then
there are u, v in A such that auEst and vbEst.

Axiom D4 is often called a solvability condition. The assumption stWxx
says that f(s) − f(t) ≥ 0. The assumption abWst says that


f(a) − f(b) ≥ f(s) − f(t).

Then Axiom D4 says that we can find u and v which “solve” the equations
f(a) − f(u) = f(s) − f(t)

and
f(v) − f(b) = f(s) − f(t).

Axiom D4 is not a necessary condition. To give an example, let A = 
{0, 1, 3}, and let D on A be defined by 


xyDuv ⇔ x − y > u − v. (3.18)


Then f(x) = x is a function satisfying Eq. (3.4). But (A, D) does not
satisfy Axiom D4. For 3,1W1,0 holds* and 1,0W0,0 holds, but there are
no u and v in A such that 3,uE1,0 and v,1E1,0 both hold. For 3,uE1,0
implies u = 2.
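
The failure of Axiom D4 on A = {0, 1, 3} can also be found by exhaustive search. The sketch below is illustrative: `D`, `E`, and `W` follow Eqs. (3.18), (3.16), and (3.17), while `d4_holds_for` and `d4_failures` are names introduced here. The hypothesis stWxx is tested with x = s, which is an assumption that is harmless for this D because every pair xx has difference 0.

```python
from itertools import product

A = [0, 1, 3]

def D(x, y, u, v):
    # Eq. (3.18): xyDuv iff x - y > u - v.
    return x - y > u - v

def E(x, y, u, v):
    # Eq. (3.16): neither difference is judged larger.
    return not D(x, y, u, v) and not D(u, v, x, y)

def W(x, y, u, v):
    # Eq. (3.17): xyDuv or xyEuv.
    return D(x, y, u, v) or E(x, y, u, v)

def d4_holds_for(a, b, s, t):
    """Conclusion of Axiom D4: some u with auEst and some v with vbEst."""
    has_u = any(E(a, u, s, t) for u in A)
    has_v = any(E(v, b, s, t) for v in A)
    return has_u and has_v

# Quadruples satisfying the hypotheses of D4 but not its conclusion.
d4_failures = [
    (a, b, s, t)
    for a, b, s, t in product(A, repeat=4)
    if W(a, b, s, t) and W(s, t, s, s) and not d4_holds_for(a, b, s, t)
]

print((3, 1, 1, 0) in d4_failures)
```

The quadruple (3, 1, 1, 0) is the case discussed in the text: solving 3 − u = 1 would need u = 2, which is not in A.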

To state our fifth axiom, let a_1, a_2, ..., a_i, ... be a sequence of
elements from A. It is called a standard sequence if a_{i+1} a_i E a_2 a_1 holds for all
a_i, a_{i+1} in the sequence and a_2 a_1 E a_1 a_1 does not hold. The idea of a
standard sequence is that the difference between two successive elements is
the same nonzero amount. This follows from the representation (3.4), since


a_{i+1} a_i E a_2 a_1 ⇔ f(a_{i+1}) − f(a_i) = f(a_2) − f(a_1),
so for all i and j,
f(a_{i+1}) − f(a_i) = f(a_{j+1}) − f(a_j);
and since ~a_2 a_1 E a_1 a_1 implies f(a_2) − f(a_1) ≠ 0.


A standard sequence is called strictly bounded if there are s, t in A such
that st D a_i a_1 and a_i a_1 D ts for all a_i in the sequence. That is,


f(t) − f(s) < f(a_i) − f(a_1) < f(s) − f(t)


for all i in the sequence. Our fifth axiom is:


AXIOM D5. Every strictly bounded standard sequence is finite.


*We have placed commas here purely to separate elements in the relation W. 
Axiom D5 is an Archimedean condition, and it is a necessary condition
for the representation (3.4). For if a_1, a_2, ... is a strictly bounded
standard sequence, then f(a_{i+1}) − f(a_i) = f(a_2) − f(a_1), for all i. Thus,


f(a_i) − f(a_1) = (i − 1)[f(a_2) − f(a_1)].


Since a_2 a_1 E a_1 a_1 does not hold, f(a_2) − f(a_1) ≠ 0. Thus, if the sequence is
infinite, then, depending on whether f(a_2) − f(a_1) is positive or negative,
the Archimedean condition on the reals implies that for all s, t, there is an i
such that f(a_i) − f(a_1) > f(s) − f(t) or there is an i such that f(t) − f(s)
> f(a_i) − f(a_1). This violates either st D a_i a_1 or a_i a_1 D ts.


Remark: We shall encounter the idea of standard sequence on a number 
of occasions in this volume. It corresponds to a very practical idea in 
measurement. Before performing a measurement, one usually agrees on 
some degree of precision, say the difference between f(a_2) and f(a_1).
Suppose this difference is positive. Then we are willing to make errors in
measurement up to this degree of precision. We construct a standard
sequence a_1, a_2, ... by repeating this fixed difference over and over
again. Given any “positive” difference f(s) − f(t) larger than the degree of
precision, that is, so that st D a_2 a_1, we find an i so that


st W a_i a_1 and a_{i+1} a_1 W st.


Now we can assert that the difference f(s) − f(t) is somewhere between
the difference f(a_i) − f(a_1) and the difference f(a_{i+1}) − f(a_1), and this
measurement is within the desired degree of precision. Standard sequences
also arise in extensive measurement. Here, we fix any a so that 2aRa. A
standard sequence is a set


{na: n ∈ I}, (3.19)


where I is any consecutive set of integers. An Archimedean axiom for
extensive measurement says that any strictly bounded standard sequence is
finite, where the standard sequence (3.19) is strictly bounded if there is a b
so that bRna for all na in the sequence. An equivalent axiom says the
following. Given a ∈ A with 2aRa and given b ∈ A so that bRa, there are
n and n + 1 so that bSna and (n + 1)aSb, where xSy ⇔ ~yRx. Thus, we
can assert that the measure f(b) is between nf(a) and (n + 1)f(a). If a is 
chosen so that f(a) is small enough, we can obtain f(b) to within any 
desired degree of precision. In the case of measurement of mass, we have a 
set of standard weights in most laboratories. Combinations of these 
weights can be used to form standard sequences. Similarly, in the
measurement of length, the ruler defines standard sequences (up to ½ inch, ¼ inch,
etc.)
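
The bracketing idea for extensive measurement described in this Remark can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `bracket` and the numerical values of f(a) and f(b) are hypothetical, and exact fractions are used so that the comparisons are not disturbed by rounding.

```python
from fractions import Fraction

def bracket(f_b, f_a):
    """Return n with n*f_a <= f_b <= (n + 1)*f_a, assuming f_b > f_a > 0.

    Only comparisons between f_b and multiples of f_a are used, mimicking
    comparisons of b against the standard sequence a, 2a, 3a, ...
    """
    if not (f_b > f_a > 0):
        raise ValueError("need f(b) > f(a) > 0")
    n = 1
    # Walk along the standard sequence until (n + 1)a is no longer below b.
    while (n + 1) * f_a < f_b:
        n += 1
    return n

unit = Fraction(1, 10)      # hypothetical f(a): the chosen precision
target = Fraction(73, 10)   # hypothetical f(b): the quantity being measured

n = bracket(target, unit)
lower, upper = n * unit, (n + 1) * unit
print(n, lower, upper)
```

Choosing a smaller unit narrows the interval [n f(a), (n + 1) f(a)], which is the sense in which f(b) can be obtained to within any desired precision.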


A relational system (A, D) satisfying Axioms D1 through D5 is called
an algebraic difference structure.

The reader should consider the “reasonableness” of the axioms for an
algebraic difference structure, for the cases where D stands for comparison 
of temperature differences and where D stands for comparison of prefer- 
ences. He also should consider the testability of the axioms, in particular 
Axioms D4 and D5. 

We now state our first representation theorem for algebraic difference 
measurement. 


THEOREM 3.8 (Krantz et al. [1971]). If (A, D) is an algebraic difference
structure, then there is a real-valued function f on A so that for all
a, b, s, t ∈ A,


abDst ⇔ f(a) − f(b) > f(s) − f(t). (3.4)


We omit the proof of this theorem. Earlier sets of sufficient conditions for 
algebraic difference measurement were given by Suppes and Winet [1955], 
Debreu [1958], Scott and Suppes [1958], Suppes and Zinnes [1963], and 
Kristof [1967]. The only known set of axioms necessary as well as sufficient 
for difference measurement is due to Scott [1964], and requires the assump- 
tion that A be finite. We state Scott’s Theorem in the next subsection.* 


3.3.2 Necessary and Sufficient Conditions 

THEOREM 3.9 (Scott). Suppose A is a finite set, D is a quaternary relation
on A, and E and W are defined by Eqs. (3.16) and (3.17). Then the following
conditions are necessary and sufficient for there to be a real-valued function f
on A satisfying, for all a, b, s, t ∈ A,


abDst ⇔ f(a) − f(b) > f(s) − f(t). (3.4)


AXIOM SD1. abWst or stWab, all a, b, s, t ∈ A.


AXIOM SD2. abDst ⇒ tsDba, all a, b, s, t ∈ A.


AXIOM SD3. If n > 0 and π and σ are permutations of {0, 1, ..., n − 1},
and if a_i b_i W a_{π(i)} b_{σ(i)} holds for all 0 < i < n, then a_{π(0)} b_{σ(0)} W a_0 b_0 holds; this is
true for all a_0, a_1, ..., a_{n−1}, b_0, b_1, ..., b_{n−1} ∈ A.


Axioms SD1 and SD2 are clearly necessary. To illustrate Axiom SD3, let
A = {a, b} and let n = 2. Let a_0 = b, a_1 = a, b_0 = a, b_1 = b. Suppose π is


*A referee has pointed out that a variant of Scott’s Theorem is valid for A of arbitrary 
cardinality, except that in the infinite case the representation is non-Archimedean in general. 
the identity permutation, the permutation that takes 0 into 0 and 1 into 1,
and σ is the permutation that takes 0 into 1 and 1 into 0. Axiom SD3 says
that if a_1 b_1 W a_{π(1)} b_{σ(1)} holds, then a_{π(0)} b_{σ(0)} W a_0 b_0 holds. Thus, a_1 b_1 W a_1 b_0
implies a_0 b_1 W a_0 b_0, or abWaa implies bbWba. This result is clearly
necessary, for


f(a) − f(b) ≥ f(a) − f(a) = 0
implies
0 = f(b) − f(b) ≥ f(b) − f(a).


Axiom SD3 is really an “infinite schema” of axioms. It is necessary to state
one axiom for every n. (Even if A has only finitely many elements, there is
need for infinitely many axioms, since the a_i and b_i do not need to be
distinct.) To show that Axiom SD3 is necessary in general, let us observe
that if f is any real-valued function on A, then


Σ_{i=0}^{n−1} f(a_i) − Σ_{i=0}^{n−1} f(b_i) = Σ_{i=0}^{n−1} f(a_{π(i)}) − Σ_{i=0}^{n−1} f(b_{σ(i)}).


Thus,
Σ_{i=0}^{n−1} [f(a_i) − f(b_i)] = Σ_{i=0}^{n−1} [f(a_{π(i)}) − f(b_{σ(i)})]. (3.20)


If f satisfies (3.4), then the hypothesis of Axiom SD3 says that
f(a_i) − f(b_i) ≥ f(a_{π(i)}) − f(b_{σ(i)})

for all 0 < i < n. Now equality in (3.20) implies
f(a_{π(0)}) − f(b_{σ(0)}) ≥ f(a_0) − f(b_0),


so a_{π(0)} b_{σ(0)} W a_0 b_0. Sufficiency of Axioms SD1 through SD3 involves a
clever argument which uses the well-known separating hyperplane
theorem. We refer the reader to Scott’s paper for details.

Before leaving this section, let us observe how the data of Table 3.9 
violates Axiom SD3. Let a_0 = b, a_1 = a, a_2 = b, b_0 = c, b_1 = b, and
b_2 = c. Let π(0) = 1, π(1) = 0, π(2) = 2, σ(0) = 0, σ(1) = 2, and σ(2) = 1.
Then, according to Axiom SD3,


abWbc & bcWbb ⇒ acWbc.


This condition is violated by the data of Table 3.9. 
3.3.3 Uniqueness


Next, we turn to a uniqueness theorem. We define a quaternary relation
Δ on Re by


xyΔuv ⇔ x − y > u − v. (3.21)


THEOREM 3.10. Suppose A is a nonempty set, D is a quaternary relation on
A, and f is a real-valued function on A satisfying


abDst ⇔ f(a) − f(b) > f(s) − f(t), (3.4)


for all a, b, s, t in A. Suppose (A, D) is an algebraic difference structure.
Then 𝔄 = (A, D) → 𝔅 = (Re, Δ) is a regular representation and (𝔄, 𝔅, f) is
an interval scale.


In the case of preference and utility, this theorem says that if judgments 
of utility difference are sufficiently regular to give rise to a utility function 
satisfying Eq. (3.4), and if certain assumptions about these judgments— 
namely, that (A, D) be an algebraic difference structure—hold, then we 
can make more than just ordinal comparisons with utility. Some writers 
call f a cardinal utility function if f defines a scale as strong as an interval 
scale. 

Before discussing the proof of this theorem, we note that it requires the 
special hypothesis that (A, D) be an algebraic difference structure. In 
particular, since Axioms D1, D2, D3, and D5 are necessary axioms for the
representation (3.4), the required assumption is Axiom D4. To see that
some assumption is needed, consider A = {0, 1, 3} and D as defined on A
by Eq. (3.18). Then two functions f and g satisfying Eq. (3.4) are given by
f(x) = x and g(0) = 1, g(1) = 2, g(3) = 8. If φ: f(A) → Re satisfies g =
φ ∘ f, then φ is not a positive linear transformation. For suppose there are
α and β, with α > 0, so that φ(x) = αx + β for all x in f(A). Then
φ(0) = β, so β = 1. Now 2 = φ(1) = α + β = α + 1, so α = 1. Thus,
φ(3) = α·3 + β = 3 + 1 = 4 ≠ g(3), so g ≠ φ ∘ f.
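
This counterexample is small enough to verify mechanically. The Python sketch below is illustrative: `represents` is a helper name introduced here, and the dictionaries `f` and `g` encode the two functions from the text. It checks that both satisfy Eq. (3.4) for the relation D of Eq. (3.18), and that the positive linear map fixed by the values at 0 and 1 misses g(3).

```python
from itertools import product

A = [0, 1, 3]

def D(x, y, u, v):
    # Eq. (3.18): xyDuv iff x - y > u - v.
    return x - y > u - v

f = {0: 0, 1: 1, 3: 3}   # f(x) = x
g = {0: 1, 1: 2, 3: 8}   # the second representation from the text

def represents(h):
    """Check Eq. (3.4): abDst iff h(a) - h(b) > h(s) - h(t), for all a, b, s, t."""
    return all(
        D(a, b, s, t) == (h[a] - h[b] > h[s] - h[t])
        for a, b, s, t in product(A, repeat=4)
    )

# If g = alpha * f + beta, the values at 0 and 1 determine alpha and beta.
alpha = (g[1] - g[0]) / (f[1] - f[0])
beta = g[0] - alpha * f[0]
predicted_g3 = alpha * f[3] + beta

print(represents(f), represents(g), predicted_g3, g[3])
```

Both functions order the differences on A in the same way, yet no choice of α and β maps f onto g, so the scale is weaker than an interval scale when Axiom D4 fails.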

The uniqueness problem for algebraic difference measurement is, there- 
fore, not completely settled by Theorem 3.10. It would be interesting to 
find necessary and sufficient conditions on those (A, D) representable in 
the form (3.4) for the representation to be unique up to a positive linear 
transformation. It would also be helpful to have a systematic treatment of 
the possible admissible transformations that arise in difference measure- 
ment. 

Turning to a proof of Theorem 3.10, we note first that by Theorem 2.1,
the representation 𝔄 → 𝔅 is regular. For


f(a) = f(b) iff abEaa.

Next, suppose f satisfies (3.4) and φ(x) = αx + β, α > 0, for all x in f(A).
Then it is easy to see that g(a) = φ ∘ f(a) also satisfies (3.4). Finally,
suppose φ: f(A) → Re has the property that g = φ ∘ f satisfies (3.4). It is
necessary to show that φ(x) is of the form αx + β, α > 0.

We sketch a proof that works under the additional assumption that for 
all b, x, y in A, there is a c in A so that bcExy. This is a stronger solvability 
condition than Axiom D4. Krantz et al. [1971] have a proof that works 
without this additional assumption. The idea for the following proof goes 
back to Hölder [1901]. Let


B = {(a, b) ∈ A × A: abDaa}
and let R on B be defined by


(a, b)R(c, d) ⇔ abDcd.


By Axiom D1 of an algebraic difference structure, (B, R) is a strict weak
order. Let (B*, R*) be its reduction. Define an operation ∘ on B* as
follows. Given s and t in B*, let (a, b) be in s, and let (x, y) be in t. Then
by the additional solvability assumption, there is c in A so that bcExy, that
is, (b, c) is in t. Define s ∘ t to be the equivalence class containing (a, c). It
is not hard to show that ∘ is well-defined. (One part of the proof is to
show that (a, c) is in B.) Having defined ∘, define F: B* → Re by


F(s) = f(a) − f(b) if (a, b) is in s.


F is well-defined and defines a homomorphism from (B*, R*, ∘) into
(Re, >, +). Given the admissible transformation φ of f, let ψ: F(B*) → Re
be defined by ψ(x − y) = φ(x) − φ(y). Then ψ is well-defined and an
admissible transformation of F. It follows from the uniqueness theorem for
extensive measurement (Theorem 3.7) that there is an α > 0 such that
ψ(x) = αx, all x in F(B*). But then
φ(x) − φ(y) = ψ(x − y) = αx − αy.
Hence, for fixed x_0 in f(A),
φ(x) − φ(x_0) = αx − αx_0
and so
φ(x) = αx + β,
with β = φ(x_0) − αx_0.
Exercises


1. Suppose n ≥ 2, A = {0, 1, 2, ..., n}, and D is defined on A by Eq.
(3.18). Show that (A, D) is an algebraic difference structure.

2. (a) If k > 0, let A = {nk: n ∈ N}. Let D be defined on A by Eq.
(3.18). Show that (A, D) is an algebraic difference structure.

(b) Which of the following relational systems (A, D) are algebraic
difference structures?
(i) A = Z, D defined by Eq. (3.18).
(ii) A = Re⁺, abDcd iff a/b > c/d.
(iii) A = Re⁺, abDcd iff a²/b² > c²/d².
(iv) A = {n*: n ∈ N}, abDcd iff a/b > c/d.

3. Suppose a subject in the laboratory is asked to estimate absolute 
temperature differences of objects he feels, and gives the data in Table 
3.10. Show that the subject is inconsistent with the algebraic difference 
model in the sense that if δ is defined from Δ as in Section 3.3.1, there is no
real-valued function f on the set A of alternative objects considered so that
Eq. (3.15) is satisfied.

4. Show that the representation (3.15) is attainable for the subject of 
Table 3.11. 

5. (a) Show that Axiom D2 for an algebraic difference structure is 
necessary for the representation (3.4). 


Table 3.10. Judgments of Absolute Temperature Difference


Objects Compared, x, y    Warmer Object    Estimated Absolute Temperature Difference Δ(x, y)
a, b                      a                3
a, c                      a                4
a, d                      a                8
b, c                      b                3
b, d                      b                5
c, d                      c                2


Table 3.11. Judgments of Absolute Temperature Difference


Objects Compared, x, y    Warmer Object    Estimated Absolute Temperature Difference Δ(x, y)
a, b                      a                2
a, c                      a                3
a, d                      a                6
b, c                      b                2
b, d                      b                3
c, d                      c                2

(b) Show that Axiom D3 for an algebraic difference structure is
necessary for the representation (3.4).

6. (a) Suppose n ≥ 2, A = {0, 1, 2, ..., n, r}, where r is not in the set
{−1, 0, 1, 2, ..., n, n + 1}, and D is defined on A by Eq. (3.18). Show
that (A, D) is not an algebraic difference structure.

(b) Show that 𝔄 = (A, D) is homomorphic to 𝔅 = (Re, Δ), where Δ
is defined by Eq. (3.21).

(c) Show that if r > 2n and if f is any homomorphism from 𝔄 into
𝔅, then (𝔄, 𝔅, f) is not an interval scale.

(d) However, if n + 1 < r ≤ 2n and f is any homomorphism from
𝔄 into 𝔅, show that (𝔄, 𝔅, f) is an interval scale.


7. Suppose A = {0, 1, 2} and D is a quaternary relation on A. Show 
that the following statements follow from Scott’s axioms: 
(a) If 0,0W0,1 and 1,1W1,2 then 2,0W2,2. 
(b) If 0,1 W0,0, then 0,0 W 1,0. 
(c) If 0,0W1,2 and 1,2W1,0, then 0,2W1,2.
8. Let R be the lexicographic ordering of the plane, and define D on 
A = Re by 


xyDuv ⇔ (x, y)R(u, v).


(a) Show that (A, D) is not an algebraic difference structure.
(b) Determine which of the axioms D1 through D5 are violated.


9. Suppose (A, D) is an algebraic difference structure and f satisfies 
Eq. (3.4). 
(a) Show that the statement f(a) > f(b) + f(c) is not meaningful. 
(b) Consider the meaningfulness of the following statements: 
(i) f(a) = 2f(b).
(ii) f(a) > f(b) + 100.
(iii) f(a) − f(b) is a constant.

10. Suppose (A, D) is an algebraic difference structure. Show the
following:
(a) There is a function g: A → Re⁺ so that for all a, b, c, d in A,


abDcd ⇔ g(a)/g(b) > g(c)/g(d).


(b) The function g defines a regular scale and the admissible
transformations of g are all functions of the form φ(x) = αx^β, α, β > 0. (In the
terminology of Section 2.3, g defines a log-interval scale.)

(c) The following statements are meaningful:

(i) g(a) > g(b).
(ii) g(a)/g(b) > g(c)/g(d).

11. (Krantz et al. [1971], Suppes and Zinnes [1963]) Suppose A is a
nonempty set and D is a quaternary relation on A. Suppose (A, D)
satisfies Axioms D1 through D3 of an algebraic difference structure. We
say that a is an immediate successor of b, and write aJb, if abDaa and there
is no c so that acDaa and cbDaa. A pair (A, D) is called an equally spaced
difference structure if it satisfies Axioms D1 through D3 and the following
condition:


(aJb & uJv) ⇒ abEuv.


Define J^n inductively as follows:


aJ^1 b is aJb,
aJ^{n+1} b holds iff there is c in A so that aJ^n c and cJb.


We shall also use the notation aJ^0 b to mean abEaa.

(a) Show that if A = {0, 1, 3}, then (A, Δ) is not an equally spaced
difference structure, where Δ is defined by Eq. (3.21).

(b) Which of the relational systems (i), (ii), (iii) or (iv) of Exer. 2b is
an equally spaced difference structure?

(c) Show that in an equally spaced difference structure, aJ^n b and
n > 0 implies abDaa.

(d) Show that in an equally spaced difference structure on a finite
set A, abDaa implies that aJ^n b, some n > 0.

(e) However, show that (d) is false if A is not finite.

(f) Show that in an equally spaced difference structure on a finite
set A, if abDaa and cdDaa, then abDcd holds if and only if there are m, n
so that


n > m ≥ 1 and aJ^n b and cJ^m d.


(g) In an equally spaced difference structure on a finite set A, let e
be a minimal element in the sense that for all a in A, ~eaDaa. Define
f: A → Re by


f(a) = 0 if aeEaa,
f(a) = n if aeDaa and aJ^n e.


Show that f is a homomorphism from (A, D) into (Re, Δ).

(h) Suppose (A, D) is an equally spaced difference structure on a
finite set A, and f is a homomorphism from (A, D) into (Re, Δ). Show that
f defines an interval scale. [Hint: Find e so that f(e) is minimal, and
consider the relation between f(a) − f(e) and f(a) − f(b).]


12. In Exer. 15 of Section 3.2, we discussed the bisection operation. 
Suppose we do not require that every pair of objects has a bisector. Rather, 
suppose we consider the ternary relation B on the set A defined by 
B(a, b, c) iff b is a bisector of a and c. As Suppes and Zinnes [1963] point 
out, the ternary relation B is related to the quaternary difference relation D 
as follows: 


B(a, b, c) ⇔ [~abDbc & ~bcDab]. (3.22)

(a) Suppose Eq. (3.22) holds, suppose (A, D) is an equally spaced
difference structure (Exer. 11), and suppose f is a homomorphism from
(A, D) into (Re, Δ). Show that if A = {x, y, z, w, u} and


f(x) > f(y) > f(z) > f(w) > f(u),
then B(x, y, z).
(b) Identify other triples (a, b, c) for which B(a, b, c) holds in part
(a).
(c) Use Eq. (3.22) and the representation (3.4) to state some
necessary axioms on the ternary relation B.
CHAPTER 4


Applications to Psychophysical 
Scaling 


4.1 The Psychophysical Problem 
4.1.1 Loudness 


A sound has a variety of physical characteristics. For example, a pure 
tone can be described by its physical intensity (energy transported), its 
frequency (in cycles per second), its duration, and so on. The same sound 
has various psychological characteristics. For example, how loud does it 
seem? What emotional meaning does it portray? What images does it 
suggest? Since the middle of the nineteenth century, scientists have tried to 
study the relationships between the physical characteristics of stimuli like 
sounds and their psychological characteristics. Some psychological char- 
acteristics might have little relationship to physical ones. For example, 
emotional meaning probably has little relation to the physical intensity of a 
sound, but rather it may be related to past experiences, as, for example, 
with the sound of a siren. Other psychological characteristics seem to be 
related in fairly regular ways to physical characteristics. Such sensations as 
loudness are an example. Psychophysics is the discipline that studies vari- 
ous psychological sensations such as loudness, brightness, apparent length, 
and apparent duration, and their relations to physical stimuli. It attempts 
to scale or measure psychological sensations on the basis of corresponding 
physical stimuli. In this chapter, we shall describe some of the history of 
psychophysical scaling and its applications or potential applications to 
measurement of noise pollution,* of attitudes, of utility, etc., and we shall 


*The loudness of a sound is different from its disturbing effect. It is this disturbing effect 
that is often called noise. Of course, noise has effects other than just disturbance. It is 
discuss ways to put psychophysical scaling on a firm measurement-theoretic 
foundation. We shall concentrate on loudness. 

The psychophysical approach to measuring sensations like loudness is 
very different from the fundamental measurement approach we spelled out 
in Chapters 2 and 3. That approach would start with an observed binary 
relation “sounds louder than,” and seek a scale that preserves this relation. 
We shall return to that approach in Chapter 6. 


4.1.2 The Psychophysical Function 


Subjective judgments of loudness are certainly dependent on the inten- 
sity of a sound. Data suggests that such judgments are also dependent on 
the frequency of a sound. Figure 4.1 shows equal-loudness contours, which 
illustrate the fact that sounds at some intensities are judged equally as loud 
as sounds at other intensities at different frequencies. Presumably, the 
duration of a sound also affects judgments of loudness. So does the rise 
time, the time for a sound to rise to maximum intensity. To simplify 
matters, one tries to eliminate all physical factors but one. For example, we 
shall study the relationship between intensity and loudness. To do so, we 
deal with pure tones, sounds of constant intensity at one fixed frequency 
(often taken to be 1000 cycles per second, cps), and we consider the case 
where these tones are presented for a fixed length of time. Then, only the 
intensity is varied. (Alternatively, we could deal with white noise, sounds 
with the same intensity at all frequencies.) 

In principle, the scaling of the loudness of any sound, no matter how 
complex, can be reduced to the scaling of loudness of 1000-cps pure tones. 
For we simply find such a pure tone that gives rise to an equal sensation as 
the original sound (Stevens [1955, p. 825]). In practice, this procedure is 
difficult to carry out. The general case of loudness scaling involves fairly 
complicated procedures which are not a straightforward generalization of 
the procedure we have described. See Kryter [1970] or Stevens [1969] for a 
detailed discussion. 

Suppose I(a) denotes the intensity of a pure tone a. I(a) is proportional 
to the square of the root-mean-square (rms) pressure p(a). The common 
unit of measurement of intensity is the decibel (dB). This is 10 log_10(I/I_0), 


becoming increasingly evident that noise has numerous physiological effects. It obviously can 
affect hearing. More subtly, it has been linked to ulcers, changes in the cardiovascular system, 
possible decrease in fertility, etc. Noise also has psychological effects. Many of these are 
closely related to perceived loudness. However, measurement of noise pollution is different 
from measurement of loudness. For a survey of effects of noise on people, see Environmental 
Protection Agency [1971] or Kryter [1970]. For a summary of alternative ways of measuring 
noise or noisiness, see Kryter [1970]. 
Figure 4.1. Equal-loudness contours. All the points on a given curve represent tones whose 
loudness is judged equal to that of a particular 1000-cps tone. The number on a given curve 
represents the “sensation level” of the given 1000-cps tone, where a sensation level of 0 is 
threshold, a sensation level of 20 is 20 dB above threshold, etc. This figure is adapted from 
Wever [1949, p. 307] and Krantz et al. [1971, p. 255] with permission of Academic Press and 
the authors. The data was due to Fletcher and Munson [1933]. 


where I_0 is the intensity of a reference sound. Thus, 


dB(a) = 10 log_10[I(a)/I_0] = 10 log_10[p(a)/p_0]^2, 


where p_0 is the rms pressure of the reference sound. (Cf. Kryter [1970] or 
Sears and Zemansky [1955, Section 23-3].) A sound of 1 dB is essentially 
the lowest audible sound.* For reference, some typical environmental 
noises measured in the decibel scale are given in Table 4.1. 
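
As a small illustration of the decibel definition above, the following Python sketch computes a level from an intensity ratio and from an rms pressure ratio. The function names and the reference values (set to 1.0 in arbitrary units) are our own assumptions, chosen only to show that a tenfold pressure ratio is a hundredfold intensity ratio and so gives the same decibel level.

```python
import math


def db_from_intensity(intensity, ref_intensity):
    """Decibel level 10*log10(I/I0) of a sound with intensity I."""
    return 10.0 * math.log10(intensity / ref_intensity)


def db_from_pressure(pressure, ref_pressure):
    """Same level from rms pressure, using I proportional to p squared."""
    return 10.0 * math.log10((pressure / ref_pressure) ** 2)


# Hypothetical reference values in arbitrary units (assumptions, not data).
I0 = 1.0
p0 = 1.0

# Multiplying the intensity by 10 adds 10 dB; by 100 adds 20 dB.
level_10x = db_from_intensity(10.0 * I0, I0)
level_100x = db_from_intensity(100.0 * I0, I0)

# A tenfold pressure ratio corresponds to a hundredfold intensity ratio.
level_p10 = db_from_pressure(10.0 * p0, p0)
```

Because the scale is logarithmic, equal ratios of intensity correspond to equal differences in decibels.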


*The reader of the acoustical literature will see notations like dBA, dBC, etc. If sounds 
occur over several frequencies, and we would like to get one number representing their sound 
pressure, we take the average pressure by summing (integrating) over different frequencies. 
Sometimes a weighted average is obtained, with some frequencies weighted more than others. 
For example, since the human ear is more sensitive to certain frequencies, it is considered 
reasonable to use a weighted average with a frequency weighted relative to the human ear’s 
sensitivity to it. This weighting procedure leads to a decibel measure on the so-called A scale, 
denoted dBA. The decibel measure dBC corresponds to yet another weighting procedure, etc. 
(Cf. Kryter [1970, p. 13].) We do not have the problem of distinguishing the different decibel 
scales, since our discussion is limited to sounds of constant frequency. 


Table 4.1. Sound Level in dBA of Some Typical Environmental Noises.* 

Entries are grouped here by the 10-dBA band containing the stated level. 
The region between 130 and 140 dBA is labeled "Painful." No sources are 
listed below 60 dBA. 

Sound Level,  Industrial or                Community (Outdoors)                 Home (Indoors) 
dBA           Machine Operator† 

120-129       Oxygen torch (121 dB) 

110-119       Snowmobile (113 dB)          Jet take-off at 1000 ft (110 dB)     Rock-and-roll band (108-114 dB) 
              Riveting machine (110 dB) 

100-109       Textile loom (106 dB)        Jet flyover at 1000 ft (103 dB) 
              Electric furnace (100 dB) 

90-99         Farm tractor (98 dB)         Rock drill at 50 ft (95 dB) 
              Newspaper press (97 dB)      Motorcycle at 50 ft (90 dB) 
              Power mower (96 dB)          Compressor at 50 ft (90 dB) 
                                           Snowmobile at 50 ft (90 dB) 

80-89         Milling machine (85 dB)      Power mower at 50 ft (85 dB)         Food blender (88 dB) 
              Lathe (81 dB)                Diesel truck at 50 ft (85 dB)        Garbage disposal (80 dB) 
                                           Diesel train at 50 ft (85 dB) 

70-79                                      Passenger car at 50 ft (75 dB)       Clothes washer (78 dB) 
                                                                                Dishwasher (75 dB) 

60-69                                      Air-conditioning unit at 50 ft       Conversation (60 dB) 
                                           (60 dB) 
                                           Large transformer at 50 ft (60 dB) 


*For meaning of A scale, see the earlier footnote on dBA and dBC. Data from Department of 
Public Health, State of California [1971]. 
†Note: Unless otherwise specified, listed sound levels prevail at typical operator-listener 
distance from source. 
Let us call the scale of loudness the psychological scale. (Its unit is known 
as the sone, a term due to S. S. Stevens.*) This is the scale we are trying to 
derive from the physical scale, the scale I of intensity in our case. We shall 
denote the loudness of a sound a by L(a). The measurement of loudness 
now reduces to the following question: What is the relation between the 
psychological scale L and the physical scale I? This relation is called the 
psychophysical law. It is usually stated as a function ψ which satisfies 
the equation 


L(a) = ψ[I(a)]. 


The function ψ used to calculate psychological values from physical ones is 
called the psychophysical function. In the general situation, if f(a) is a 
physical scale and g(a) is a corresponding psychological scale, and if ψ is 
the psychophysical function, then for all a, 


g(a) = ψ[f(a)]. 


We are in a situation of derived measurement, trying to derive one scale 
from another. The only difference between our present situation and the 
situations studied in Sections 2.5 and 2.6 is that we do not know the 
function ψ relating one scale to another. 

A basic goal of psychophysics is to find the general form of the 
psychophysical function ψ which applies in many different cases of physical 
and psychological scales. This general form is often guessed at from 
large amounts of data, or even derived on the basis of some general 
assumptions. Then specific parameters needed to determine the exact form 
of the function ψ in a particular application are estimated in context. 

The first attempt to specify the psychophysical function for a large class 
of psychological variables was made by Fechner [1860]. He argued that, 
under reasonable assumptions, the psychophysical law was logarithmic, 
that is, of the form 


ψ(x) = α log x + β, (4.1) 


for α and β constant.* Equation (4.1) is called Fechner’s Law. In the case 
of loudness, Fechner’s Law says that L(a) = α log I(a) + β. The decibel 
scale of loudness arises from a special case of Fechner’s Law, where the 
base of log is 10, α = 10, and β = −10 log_10 I_0. If L(a) = dB(a), then a 


*A sone corresponds to the loudness of a pure tone of 40 dB at 1000 cps (Stevens [1955]). 

*The base of log may be any number, since the change of base can be incorporated in the 
constant α. A critical assumption Fechner used in arguing for the logarithmic law was that 
you could scale sensations on the basis of variability or confusion or error: if two pairs of 
stimuli are equally often confused, then psychologically they are equally far apart. This 
assumption, which is embodied in the Fechnerian utility model of Section 6.2, has been 
questioned by writers such as S. S. Stevens (see below). 
doubling of the dB level of a sound should lead to a doubling of the 
perceived loudness. It follows, for example, that a sound of 100 dB should 
sound twice as loud as a sound of 50 dB. Not long after the introduction of 
the dB scale, acoustical engineers noted that this seemed to be false 
(Stevens [1955, pp. 815, 816; 1957, p. 163]). Other data violated even the 
general form of Fechner’s Law (see below). 

One of the earliest attempts at measuring loudness was that of Fletcher 
and Munson [1933].* They assumed that loudness was proportional to the 
number of auditory nerve impulses reaching the brain. Thus, a sound 
delivered to two ears should appear to be twice as loud as it is when 
presented to only one ear.† Fletcher and Munson discovered that a tone 
presented to only one ear had to be about 10 dB higher in energy 
(intensity) level than the level of a tone presented to both ears and judged 
equally loud. Thus, they concluded that subjective loudness doubles for each 
10-dB increase in sound pressure.* Hence, Fletcher and Munson suggested 
that an increase from 50 dB to 60 dB should double perceived loudness, 
whereas if decibels measure loudness, the increase would have to be from 
50 dB to 100 dB! The Fletcher—Munson observation has been confirmed 
(at least approximately) by many experiments. See Stevens [1955] for a 
summary of data and experiments. The Fletcher-Munson observation 
implies that there are no α and β so that for all sounds (pure tones) a, 


L(a) = α log_10 I(a) + β. 


That is, the Fletcher-Munson observation implies that the general form of 
Fechner’s Law could not hold. For suppose this law were to hold. Then 
certainly α ≠ 0; otherwise L(a) is constant for all a. Moreover, we have 


*The approach of Fletcher and Munson is discussed in some detail in Kryter [1970, pp. 
247, 255]. 

†This is a special case of the general hypothesis sometimes called the loudness summation 
hypothesis, which says the following. Suppose the loudness of a plus the loudness of x equals 
the loudness of b plus the loudness of y. Suppose a is presented to the left ear and 
simultaneously x to the right ear. This should produce a sound equally as loud as when b is 
presented to the left ear and simultaneously y to the right ear. We discuss this summation or 
additive hypothesis in our discussion of conjoint measurement in Section 5.4. For reviews of 
this hypothesis, see Hirsch [1948], Reynolds and Stevens [1960], Bekesy [1960], Treisman and 
Irwin [1967], Scharf [1969], Tobias [1972], Levelt, Riemersma, and Bunt [1972], and 
Falmagne, Iverson, and Marcovici [1978]. 

*It has been observed that noise levels in urban areas in the United States have been 
increasing at the rate of approximately 1 dB a year. Thus, it is fair to conclude that, on the 
basis of results like those of Fletcher and Munson, noise levels in our urban areas are 
doubling every ten years (Rienow and Rienow [1967, p. 179].) 




for all a, 

L(a) = α log_10 I(a) + β 
     = α log_10[I(a)/I_0] + (β + α log_10 I_0) 
     = α log_10[I(a)/I_0] + β′ 
     = α′dB(a) + β′, 

where β′ = β + α log_10 I_0 and α′ = α/10, since dB(a) = 10 log_10[I(a)/I_0]. 

Suppose dB(b) = dB(a) + 10. By the Fletcher-Munson observation, L(b) 
= 2L(a). Thus, 


α′dB(b) + β′ = 2[α′dB(a) + β′], 
or 
α′[dB(a) + 10] + β′ = 2[α′dB(a) + β′], 
or, using α ≠ 0 (so that α′ ≠ 0), 
dB(a) = (10α′ − β′)/α′. 


That is, for all a, dB(a) is a constant. This is a contradiction. 
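
The contradiction can also be seen numerically. The sketch below assumes a logarithmic law written in decibel form, loudness = slope × dB + intercept, with hypothetical values for the slope and intercept (the names and numbers are ours, not data). It shows that a 10-dB increase doubles the predicted loudness at exactly one decibel level and at no other.

```python
def fechner_loudness(db, slope, intercept):
    """Loudness predicted by a logarithmic law expressed in decibels."""
    return slope * db + intercept


def doubling_level(slope, intercept):
    """The single dB level at which a 10-dB increase doubles loudness.

    Solves slope*(d + 10) + intercept = 2*(slope*d + intercept) for d.
    """
    return 10.0 - intercept / slope


# Hypothetical parameters, chosen only for illustration.
slope, intercept = 0.5, 3.0

d_star = doubling_level(slope, intercept)
ratio_at_star = (fechner_loudness(d_star + 10.0, slope, intercept)
                 / fechner_loudness(d_star, slope, intercept))

# At any other level, for example 40 dB, the ratio is not 2.
ratio_elsewhere = (fechner_loudness(50.0, slope, intercept)
                   / fechner_loudness(40.0, slope, intercept))
```

With these values the doubling occurs only at 4 dB, whereas the Fletcher-Munson observation requires it at every level.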

Stevens [1957, 1960, 1961a,b,c, and elsewhere] has argued that instead of 
a logarithmic law, the fundamental psychophysical law for loudness and 
many other psychological variables is a power law, 


ψ(x) = αx^β, (4.2) 


for α, β constant, α > 0. This idea goes back to Plateau [1872]. We shall 
see in Section 4.3 that the power law is consistent with data like that of 
Fletcher and Munson, at least if the exponent β is chosen properly. The 
data of Stevens and his colleagues [1960 and elsewhere] seems to suggest 
that, at least to a first approximation, and sometimes only for limited 
intervals of values of the physical stimuli, the power law holds for more 
than two dozen psychological variables. These psychological variables are 
shown in Table 4.2. In general, these variables seem to be concerned with 
quantitative judgments, like “how much?” Stevens calls such variables 
prothetic continua. Other psychological variables are more concerned with 
qualitative judgments, like “what kind?” or “where?” Examples of such 
variables are pitch, apparent azimuth, and apparent inclination. These 
variables are called metathetic continua. The power law may or may not 
hold on metathetic continua (Stevens [1968]). For example, it fails for pitch 
as a function of frequency (Stevens and Volkmann [1940]). However, for 
Table 4.2. Some Prothetic Psychological Continua and Their Exponents under the 
Power Law* 


Psychological        Name of 
Continuum            Psychological Unit   Exponent   Conditions 

Loudness             Sone                 0.3        Binaural, 1000-cps tone 
Loudness                                  0.27       Monaural, 1000-cps tone 
Brightness           Bril                 0.33       5° target, dark-adapted eye 
Brightness                                0.5        Point source, dark-adapted eye 
Lightness                                 1.2        Reflectance of gray papers 
Smell                                     0.55       Coffee odor 
Smell                                     0.6        Heptane 
Taste                Gust                 0.8        Saccharine 
Taste                                     1.3        Sucrose 
Taste                                     1.3        Salt 
Temperature                               1.0        Cold, on arm 
Temperature                               1.6        Warm, on arm 
Vibration                                 0.95       60 cps, on finger 
Vibration                                 0.6        250 cps, on finger 
Duration             Chron                1.1        White noise stimulus 
Repetition rate                           1.0        Light, sound, touch, and shocks 
Finger span                               1.3        Thickness of wood blocks 
Pressure on palm                          1.1        Static force on skin 
Heaviness            Veg                  1.45       Lifted weights 
Force of handgrip                         1.7        Precision hand dynamometer 
Vocal effort                              1.1        Sound pressure of vocalization 
Electric shock                            3.5        60 cps through fingers 


*Table adapted from Stevens [1957, p. 166; 1960, p. 236; 1961b]. 


prothetic continua, the power law is accepted quite widely. According to 
Ekman and Sjoberg [1965]: “As an experimental fact, the power law is 
established beyond any reasonable doubt, possibly more firmly established 
than anything else in psychology.” Still, there is some conflicting evi- 
dence.* 
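
To see why a power law can accommodate a doubling of loudness with every 10-dB increase, note that a 10-dB increase multiplies intensity by 10, so under ψ(x) = αx^β loudness is multiplied by 10^β regardless of the starting level. The Python sketch below checks this; the value of α and the sample intensities are arbitrary assumptions made for illustration.

```python
import math


def power_law(x, alpha, beta):
    """Power-law psychophysical function alpha * x**beta."""
    return alpha * x ** beta


# Exponent for which a tenfold increase in intensity doubles loudness.
beta_double = math.log10(2.0)

alpha = 1.0  # hypothetical scale constant
sample_intensities = (1.0, 50.0, 1.0e4)  # arbitrary units

# The loudness ratio for a tenfold intensity increase does not depend on x.
ratios = [
    power_law(10.0 * x, alpha, beta_double) / power_law(x, alpha, beta_double)
    for x in sample_intensities
]

# The tabulated loudness exponent 0.3 gives nearly the same factor.
factor_for_0_3 = 10.0 ** 0.3
```

The required exponent, log_10 2, is about 0.301, which is close to the value 0.3 listed for binaural loudness in Table 4.2.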

In summary, the literature of psychophysics has not been and still is not 
in agreement on the general form of the psychophysical law. In the next 


*Pradhan and Hoffman [1963] found violations in the power law by individuals (though 
not by the whole group of subjects, if the data was averaged). Others have found tremendous 
variability in data, both between different individual subjects and for an individual subject 
being retested. (See Luce and Mo [1965], Schneider and Lane [1963], and Stevens and Guirao 
[1964].) Stevens [1957, 1959c, 1961b,c, 1971] argues that this variation is primarily due to 
randomness in the data, and it averages out over subjects. However, Pradhan and Hoffman 
suggest that the power law is simply an “artifact” of group averaging. And Green and Luce 
[1974] try to make the case that the variations might be “intimately related” to the 
decisionmaking process underlying sensory judgments. Luce [1972] argues that the large 
variation in data makes it hard to speak of psychophysical measurement as analogous to 
physical measurement or, indeed, any sort of fundamental measurement. For other criticisms 
of the power law, see Savage [1970]. 
section, we discuss a theory that allows us to determine the possible forms 
of the psychophysical law. In Section 4.3, we introduce assumptions that 
allow us to derive the power law, and we discuss some implications of the 
power law, and some of its applications. In Section 4.4, we introduce a 
measurement axiomatization for the key assumption used to justify the 
power law in Section 4.3. 


Exercises 


1. A 40-dB pure tone at 1000 cps receives a loudness rating of 1 sone. 
According to the Fletcher-Munson observation, what sone rating does a 
60-dB pure tone at 1000 cps receive? 


2. (Stevens [1960]) It is possible to introduce a decibel scale for light as 
well as sound. We would define 


N_dB = 10 log_10(E/E_0), 


where E is the light energy and E_0 is a reference light energy. The 
brightness of a light a, B(a), is a function ψ(E(a)). The unit of brightness is 
the bril. If B(a) = N_dB(a), a halving in the number of brils would 
correspond to a halving in the decibel level. Yet, experiments suggest that, as 
with sound, a halving in the number of brils corresponds to a 10-dB 
decrease. Show that this observation even violates the general form of 
Fechner’s Law: 


B(a) = α log_10 E(a) + β. 


3. The decibel scale could also be used for vibration. The decibel level 
U_dB would be taken as 10 log_10(E/E_0), with E physical energy transmitted 
and E_0 a reference energy level. Then a 5-U_dB increase corresponds to a 
doubling of sensation. (See Exer. 6 of Section 4.3.) Show that Fechner’s 
Law again fails to hold. 


4. If a plot of physical versus psychological scales is made in log—log 
coordinates, show that the power law predicts the plot will be a straight 
line. 

5. (Stevens [1960, p. 235]) To most observers, the apparent length of a 
straight line of 100 cm is about twice that of a straight line of 50 cm; that 
of a straight line of 80 cm is about twice that of a straight line of 40 cm; 
and so on. Thus, if the physical scale is length and the psychological scale 
is apparent length, and if the psychophysical function ψ is a power 
function, show that ψ(x) = αx. (We make a stronger observation in Exer. 
1, Section 4.2.) 


6. Since sound pressure p defines a ratio scale, so does sound intensity. 
However, show that the decibel scale is an absolute scale in the wide sense. 
(It is not a difference scale.) 
Table 4.3. Median Threshold of Audibility in Decibels* 


                              Frequency (cps) 
                    500   1000   1500   2000   3000   4000   6000 
Farmworkers           4      5      4      4     18     33     30 
Office workers        3      5      7      8     15     22     26 
Factory workers       8      9     11     16     35     43     42 


*Data from Glorig et al. [1957]. 


7. The threshold of audibility or hearing level (in decibels) is measured 
for various individuals, and it depends on frequency. (A lower threshold 
means more acute hearing.) In a study reported in Kryter [1970, p. 116], 
Glorig et al. [1957] measured the median threshold of audibility for 
farmworkers, office workers, and factory workers, and reported the data 
shown in Table 4.3. 

(a) Is it meaningful to assert that the median threshold of hearing of 
farmworkers at 1000 cps is better than that of factory workers? 

(b) Is it meaningful to assert that the median threshold of hearing of 
office workers at 2000 cps is twice as high as that of farmworkers, that is, 
that the threshold has been increased (worsened) by 100%? 

(c) If arithmetic means had been calculated instead of medians, 
would the statements of (a) and (b) have been meaningful? 


8. The American Academy of Ophthalmology and Otolaryngology of 
the American Medical Association proposed in Lierle [1959] that the 
percentage of hearing impairment suffered by an individual should be 
measured as follows. (This description disregards how to average in the 
effect of differential hearing loss in two ears.) Measure the threshold of 
audibility (Exer. 7) at the three frequencies 500, 1000, and 2000 cps. 
Subtract a fixed number of dB (15 dB) from each. Average the three 
numbers, and then multiply by 1.5%. If t_i(f) is the threshold of audibility 
of individual i at frequency f, then the impairment of i is given by 


Imp_i = {[t_i(500) − 15] + [t_i(1000) − 15] + [t_i(2000) − 15]}/3 × 1.5. 


If Imp is to be considered a percentage of hearing impairment, it should be 
meaningful to say that an individual with an Imp of 60 has only 60% of the 
impairment of an individual with an Imp of 100. Is this a meaningful 
assertion? (For a further discussion of measurement of impairment, see 
Kryter [1970, pp. 125ff].) 


4.2 The Possible Psychophysical Laws 


In trying to derive a psychological scale g from a physical scale f, we 
need to determine the psychophysical function ψ relating g to f. The 
domain and range of ψ are usually taken to be all of Re or Re⁺, though 
real intervals can be used. Thus, it is assumed that all real numbers or 
positive real numbers are (potential) physical and psychological scale 
values. Although the function ψ may not be known, some of its properties 
may be. For example, we are usually willing to assume that the 
psychophysical function is continuous. We might also discover that ψ is additive, 
that is, it satisfies 


ψ(x + y) = ψ(x) + ψ(y), (4.3) 


for all x, y in the domain of ψ. Equation (4.3) gives a so-called functional 
equation involving the function ψ. In the next subsection, we indicate the 
continuous solutions to four simple functional equations called the Cauchy 
equations (Cauchy [1821]). Our approach follows that of Aczél [1966]. We 
shall apply the solutions to several of these Cauchy equations to derive 
possible forms of the psychophysical function. 


4.2.1 Excursus: Solution of the Cauchy Equations 


Equation (4.3) is the first Cauchy equation. The remaining Cauchy 
equations are 


ψ(x + y) = ψ(x)ψ(y), (4.4) 

ψ(xy) = ψ(x) + ψ(y), (4.5) 

and 

ψ(xy) = ψ(x)ψ(y). (4.6) 


THEOREM 4.1. Let Re’ = Re or Re⁺. Suppose ψ: Re’ → Re satisfies the 
first Cauchy equation 


ψ(x + y) = ψ(x) + ψ(y) (4.3) 


for all x and y in Re’, and suppose that ψ is continuous. Then there is a real 
number c such that 


ψ(x) = cx, (4.7) 

all x. 

Proof. By induction from Eq. (4.3), ψ(nx) = nψ(x), all positive integers 
n. Next, let x = (m/n)t, m, n positive integers. Then nx = mt so ψ(nx) = 
ψ(mt), whence nψ(x) = mψ(t). We conclude that for all positive integers m 
and n and for all t in Re’, 


ψ((m/n)t) = (m/n)ψ(t). (4.8) 


Suppose ψ(1) = c. Since t can be any positive real number, we let t = 1 in 
(4.8) and conclude that 


ψ(r) = cr (4.9) 


for all positive rationals r. Since (4.9) holds for all positive rational 
numbers r, (4.7) follows for all positive real numbers x by taking limits on 
both sides of (4.9). (More precisely, by density of the rationals, we find a 
sequence of rationals r_1, r_2, ..., r_n, ... such that r_n → x. Then continuity 
of ψ implies that ψ(r_n) → ψ(x). But we know that ψ(r_n) = cr_n, so ψ(r_n) → 
cx.) This proves the theorem if Re’ = Re⁺. 

If Re’ = Re, then note that ψ(0) = 0 = c·0 follows immediately from 
(4.3). If x is negative, then ψ(x) = ψ(0) − ψ(−x) = −ψ(−x) = −c(−x) 
= cx. ∎ 


Remark: This theorem (and the next three) hold if ψ(x) is assumed 
continuous at only one point x. This observation is due to Darboux 
[1875]. To see why Darboux’s observation holds in the present case when 
Re’ = Re, we suppose that ψ(x) satisfies (4.3) and is continuous at x_0. 
Then ψ(t) → ψ(x_0) as t → x_0. For any other x, we have 


lim_{u→x} ψ(u) = lim_{u→x} ψ[(u − x + x_0) + (x − x_0)] 
               = lim_{t→x_0} ψ[t + (x − x_0)] 
               = lim_{t→x_0} [ψ(t) + ψ(x − x_0)] 
               = ψ(x_0) + ψ(x − x_0) 
               = ψ(x_0 + x − x_0) 
               = ψ(x). 


Thus, ψ is continuous at all x. If Re’ = Re⁺, the argument is slightly more 
complicated. 


LEMMA. Let Re’ = Re or Re⁺. If ψ: Re’ → Re satisfies the second Cauchy 
equation 


ψ(x + y) = ψ(x)ψ(y) (4.4) 

for all x and y in Re’, then either ψ(x) = 0, all x, or ψ(x) > 0, all x. 

Proof. Suppose ψ(x_0) = 0. If Re’ = Re, then for all x ∈ Re’, 

ψ(x) = ψ[(x − x_0) + x_0] = ψ(x − x_0)ψ(x_0) = 0. (4.10) 

If Re’ = Re⁺, then for all x > x_0, (4.10) holds, and so ψ(x) = 0. If 
y ∈ Re⁺ and y < x_0, then ny > x_0, some positive integer n. Hence, ψ(ny) 
= 0. But 


ψ(ny) = ψ(y)^n, 


so ψ(y) = 0. 

Suppose next that for all x in Re’, ψ(x) ≠ 0. Then for all x in Re’, 


ψ(x) = ψ(x/2 + x/2) = [ψ(x/2)]^2 > 0. ∎ 


THEOREM 4.2. Let Re’ = Re or Re⁺. Suppose ψ: Re’ → Re satisfies the 
second Cauchy equation 


ψ(x + y) = ψ(x)ψ(y) (4.4) 

for all x and y in Re’, and suppose that ψ is continuous. Then either 

ψ ≡ 0 

or 

ψ(x) = e^{cx}, 

some real constant c. 

Proof. Suppose ψ ≢ 0. Let f(x) = ln ψ(x). By the lemma, f is 
well-defined, since ψ(x) > 0, all x. Then f(x + y) = f(x) + f(y) and so f 
satisfies the first Cauchy equation, Eq. (4.3). By Theorem 4.1, f(x) = cx, 
some real c. We conclude 

ψ(x) = e^{f(x)} = e^{cx}, 

as desired. ∎ 


THEOREM 4.3. Suppose ψ: Re⁺ → Re satisfies the third Cauchy equation 


ψ(xy) = ψ(x) + ψ(y) (4.5) 

for all positive reals x and y, and suppose that ψ is continuous. Then 

ψ(x) = c ln x, 

some real constant c. 

Proof. Let f(x) = ψ(e^x). Then 

f(x + y) = ψ(e^{x+y}) = ψ(e^x e^y) = ψ(e^x) + ψ(e^y) = f(x) + f(y). 

We conclude that f satisfies the first Cauchy equation, Eq. (4.3), for all real 
x and y. Thus, f(x) = cx, some real constant c. It follows that for all 
positive x, ψ(x) = f(ln x) = c ln x. ∎ 

COROLLARY. Suppose ψ: Re → Re satisfies the third Cauchy equation 

ψ(xy) = ψ(x) + ψ(y) (4.5) 

for all reals x and y, and suppose that ψ is continuous. Then ψ ≡ 0. 

Proof. ψ(x) = c ln x for all x ∈ Re⁺. But 

ψ(0) = ψ(0 · 0) = ψ(0) + ψ(0), 

so ψ(0) = 0. By continuity of ψ, c ln x → 0 as x → 0, so c = 0. Therefore, 
ψ(x) = 0, all x ≥ 0. 

Finally, if x is negative, 

ψ(x) = ψ[(−1)(−x)] = ψ(−1) + ψ(−x) = ψ(−1). 

Thus, ψ(x) = ψ(−1), all x < 0. By continuity, ψ(−1) = ψ(0) = 0. ∎ 


THEOREM 4.4. Suppose ψ: Re⁺ → Re satisfies the fourth Cauchy equation 


ψ(xy) = ψ(x)ψ(y) (4.6) 

for all positive reals x and y, and suppose that ψ is continuous. Then either 

ψ ≡ 0 

or 

ψ(x) = x^c, 

some real c. 

Proof. As in the proof of Theorem 4.3, let f(x) = ψ(e^x). Then f satisfies 
the second Cauchy equation (4.4). Proceed from there. 
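
The four solution families of Theorems 4.1 through 4.4 can be spot-checked numerically. The Python sketch below is only a sanity check under assumptions of our own: the constant c = 0.7 and the sample points are arbitrary, and all points are positive so that every one of the four functions is defined.

```python
import math

C = 0.7  # hypothetical constant c, chosen for illustration

# Each entry: (psi, how the arguments combine, how the values combine).
solutions = {
    "first": (lambda x: C * x, lambda x, y: x + y, lambda u, v: u + v),
    "second": (lambda x: math.exp(C * x), lambda x, y: x + y, lambda u, v: u * v),
    "third": (lambda x: C * math.log(x), lambda x, y: x * y, lambda u, v: u + v),
    "fourth": (lambda x: x ** C, lambda x, y: x * y, lambda u, v: u * v),
}


def max_residual(psi, combine_args, combine_values, points):
    """Largest gap between the two sides of a Cauchy equation on a grid."""
    return max(
        abs(psi(combine_args(x, y)) - combine_values(psi(x), psi(y)))
        for x in points
        for y in points
    )


points = [0.25, 1.0, 2.5, 7.0]  # positive sample points (assumption)

residuals = {
    name: max_residual(*spec, points) for name, spec in solutions.items()
}
```

Residuals at the level of rounding error are consistent with the theorems. Such a check does not, of course, show that these are the only continuous solutions; that is what the proofs establish.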


COROLLARY. Suppose ψ: Re → Re satisfies the fourth Cauchy equation 


ψ(xy) = ψ(x)ψ(y) (4.6) 

for all real x and y, and suppose that ψ is continuous. Then either 

(a) ψ ≡ 0, 

or for some real c, 

(b) ψ(x) = x^c if x > 0, 
    ψ(x) = (−x)^c if x < 0, 
    ψ(x) = 0 if x = 0, 

or for some real c, 

(c) ψ(x) = x^c if x > 0, 
    ψ(x) = −(−x)^c if x < 0, 
    ψ(x) = 0 if x = 0. 


Proof. By Theorem 4.4, ψ(x) = 0 or ψ(x) = x^c holds for all positive 
reals. If the former, then x negative implies that 


ψ(x) = ψ(−1)ψ(−x) = 0, 


so ψ = 0 for all negative x. By continuity, ψ ≡ 0. 

If ψ(x) = x^c, all x positive, and if x is negative, then 


ψ(x) = ψ(−1)(−x)^c. 


Moreover, 

ψ(−1)ψ(−1) = ψ[(−1)(−1)] = ψ(1) = 1^c = 1. 

Thus, ψ(−1) = 1 or ψ(−1) = −1. In the former case, using continuity, 
one concludes that ψ has the form (b). In the latter case, one concludes 
that ψ has the form (c). ∎ 


Remark: This Corollary does not hold if ψ is only continuous at a point. 
For example, the following function ψ is continuous everywhere but at 0, 
and satisfies the fourth Cauchy equation: 


ψ(x) = 1 if x ≠ 0, 
ψ(x) = 0 if x = 0. 


4.2.2 Derivation of the Possible Psychophysical Laws 


Luce [1959] observed that we can sometimes derive the possible forms of 
psychophysical functions ψ if we assume that they must be continuous and 
we know (on some grounds) what types of scales f and g form. We present 
Luce’s results here.* We derive the possible psychophysical functions in 
four cases, where f is taken to be either a ratio scale or an interval scale 
and g is taken to be a ratio or an interval scale in the wide sense.† 
(Additional cases are handled by Luce.) The results are summarized in 
Table 4.4. 


Table 4.4. Possible Psychophysical Laws 


Physical Scale     Psychological Scale    Functional Equation           Psychophysical Function 

f: A → Re⁺         g: A → Re⁺             ψ(kx) = K(k)ψ(x),             ψ(x) = αx^β 
ratio scale        ratio scale in         k > 0, K(k) > 0 
                   the wide sense 

f: A → Re⁺         g: A → Re              ψ(kx) = K(k)ψ(x) + C(k),      ψ(x) = α ln x + β 
ratio scale        interval scale in      k > 0, K(k) > 0               or 
                   the wide sense                                       ψ(x) = αx^β + δ 

f: A → Re          g: A → Re⁺             ψ(kx + c) = K(k, c)ψ(x),      ψ constant 
interval scale     ratio scale in         k > 0, K(k, c) > 0 
                   the wide sense 

f: A → Re          g: A → Re              ψ(kx + c) =                   ψ(x) = αx + β 
interval scale     interval scale in      K(k, c)ψ(x) + C(k, c), 
                   the wide sense         k > 0, K(k, c) > 0 

It should be mentioned that the results that follow have much wider 
applicability than just to the determination of the possible psychophysical 
laws. Indeed, given any two scales of known scale type that are related by 
some unknown law, we can derive the possible forms of this law by Luce’s 
methods. In Exer. 6, we explore the application of this idea to laws such as 
Newton’s Law and Ohm’s Law relating physical variables to each other 


*For a criticism of this approach and a reply, see Rozeboom [1962a, b] and Luce [1962]. 
†We use the term “in the wide sense” because we are thinking of derived measurement. 

and laws relating geometrical variables such as volume and radius of a 
sphere. 

We shall always assume that an interval scale has a range of all real 
numbers, but that a ratio scale can be limited to a range of positive reals. If 
the physical scale f has range Re’, where Re’ is Re or Re⁺, then we shall 
assume that f attains all possible values in its range; that is, every real 
(positive real) number is a (potential) scale value for some stimulus. Thus, 
ψ will have domain all of Re’ (and range the range of the psychological 
scale g). Hence, if A is the set of objects being measured both physically 
and psychologically, our assumptions can be summarized as follows: 


If f is a ratio scale, f(A) = Re⁺. 
If f is an interval scale, f(A) = Re. 
If g is a ratio scale, g(A) ⊆ Re⁺. 
If g is an interval scale, g(A) ⊆ Re. 


The general procedure of Luce is to use the observation, based on our 
discussion of Section 2.5, that an admissible transformation of f leads to an 
admissible transformation of g (in the wide sense).* Using Luce’s observa- 
tion, we shall obtain a functional equation for the psychophysical function. 
For example, suppose both f and g are ratio scales, the latter in the wide 
sense. Then multiplication by a positive constant k is an admissible 
transformation of f and so must result in an admissible transformation of 
g, that is, multiplication by a positive constant K(k). Thus, for all a in A,
the set of objects being measured both physically and psychologically,
$\psi[kf(a)] = K(k)\psi[f(a)]$. Thus, for all x in the range of f, we have the
equation


$\psi(kx) = K(k)\psi(x)$. (4.11)


By our conventions, all positive reals x are attained as scale values f(a), so 
(4.11) holds for all positive reals x. To solve Eq. (4.11), we reduce it to one 
of the Cauchy equations solved in the previous section. The procedure 
under other assumptions about the scale types of f and g is similar. 
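
As a numerical illustration of Eq. (4.11), the following sketch checks on a small grid that a power function satisfies the equation when K(k) is taken to be the matching power of k, while a logarithmic function does not satisfy it for that K. The parameter values, the grid, and the helper names are assumptions chosen for illustration only; they do not come from Luce's argument.

```python
import math

def power_law(alpha, beta):
    """Return psi(x) = alpha * x**beta for x > 0 (illustrative parameters)."""
    return lambda x: alpha * x ** beta

def satisfies_ratio_equation(psi, K, ks, xs, tol=1e-9):
    """Check psi(k*x) == K(k) * psi(x) for every k in ks and x in xs."""
    return all(
        math.isclose(psi(k * x), K(k) * psi(x), rel_tol=tol)
        for k in ks for x in xs
    )

alpha, beta = 2.5, 0.3            # hypothetical values, not estimates from data
psi = power_law(alpha, beta)
K = lambda k: k ** beta           # rescaling of g induced by multiplying f by k

grid = [0.1, 0.5, 1.0, 2.0, 10.0]
log_law = lambda x: math.log(x) + 5.0   # a non-power candidate for comparison

print(satisfies_ratio_equation(psi, K, grid, grid))
print(satisfies_ratio_equation(log_law, K, grid, grid))
```

A finite grid can only refute a candidate, never prove that it solves the equation for all positive reals; the theorem below supplies the general statement.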


THEOREM 4.5 (Luce). Suppose the psychophysical function is continuous
and suppose $f: A \to \mathrm{Re}^+$ and $g: A \to \mathrm{Re}^+$ are both ratio scales, the latter in
the wide sense. Then


$\psi(x) = \alpha x^\beta$,
where $\alpha > 0$.


*It is this observation that Rozeboom [1962a] criticizes. Luce [1962] admits that his 
procedure is subject to difficulties if the psychophysical law involves “dimensional parame- 
ters” that can be transformed only by transformations that depend on the transformation of 
the physical parameter f. (Compare the Remark at the end of Section 2.5.) 
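Proof. By our convention, $\psi: \mathrm{Re}^+ \to \mathrm{Re}^+$. We have already shown that $\psi$
satisfies the functional equation (4.11), for $k > 0$ and $K(k) > 0$. Setting
$x = 1$ in (4.11), we obtain


$\psi(k) = K(k)\psi(1)$, (4.12)


all $k > 0$. Since the range of $\psi$ is contained in $\mathrm{Re}^+$, $\psi(1) > 0$. Thus,
$K(k) = \psi(k)/\psi(1)$. Now Eq. (4.11) becomes


$\psi(kx) = \psi(k)\psi(x)/\psi(1)$.


For all $x \in \mathrm{Re}^+$, let $\gamma(x) = \ln[\psi(x)/\psi(1)]$. This function is well-defined,
since the range of $\psi$ is contained in $\mathrm{Re}^+$ and so $\psi(x)/\psi(1) > 0$. We have


$\gamma(kx) = \ln[\psi(kx)/\psi(1)]$
$= \ln\left[\frac{\psi(k)\psi(x)}{\psi(1)\psi(1)}\right]$
$= \ln\left[\frac{\psi(k)}{\psi(1)} \cdot \frac{\psi(x)}{\psi(1)}\right]$
$= \ln\left[\frac{\psi(k)}{\psi(1)}\right] + \ln\left[\frac{\psi(x)}{\psi(1)}\right]$
$= \gamma(k) + \gamma(x)$.


Since $\psi$ is continuous, so is $\gamma$, and thus by Theorem 4.3,
$\gamma(x) = \beta \ln x = \ln x^\beta$.


It follows that $\psi(x) = \alpha e^{\gamma(x)} = \alpha x^\beta$, where $\alpha = \psi(1)$. Finally, note that
$\alpha > 0$, since $\psi(1) > 0$. ■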
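
The logarithmic substitution used in the proof can be tried out numerically. The sketch below, under the assumption that we are handed a power function with hypothetical parameters, forms the auxiliary function from the proof, checks that it is additive in the sense of the Cauchy equation, and reads the two parameters back off. The function names are ours.

```python
import math

def gamma_from_psi(psi):
    """Auxiliary function from the proof: gamma(x) = ln[psi(x) / psi(1)]."""
    psi1 = psi(1.0)
    return lambda x: math.log(psi(x) / psi1)

def recover_parameters(psi, x=10.0):
    """Recover (alpha, beta) via alpha = psi(1) and beta = gamma(x) / ln x, x != 1."""
    gamma = gamma_from_psi(psi)
    return psi(1.0), gamma(x) / math.log(x)

psi = lambda x: 4.0 * x ** 0.67     # hypothetical power law
gamma = gamma_from_psi(psi)

# gamma(k*x) should equal gamma(k) + gamma(x) up to rounding error.
k, x = 3.0, 7.0
additive_gap = abs(gamma(k * x) - (gamma(k) + gamma(x)))
alpha_hat, beta_hat = recover_parameters(psi)
print(additive_gap, alpha_hat, beta_hat)
```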


The next result is obtained immediately from the proof of the preceding 
theorem. 


COROLLARY 1. Suppose $\psi: \mathrm{Re}^+ \to \mathrm{Re}^+$ is continuous and satisfies the
functional equation


$\psi(kx) = K(k)\psi(x)$ (4.11)
for $k > 0$ and $K(k) > 0$. Then $\psi(x) = \alpha x^\beta$, where $\alpha > 0$.


COROLLARY 2. Suppose $\psi: \mathrm{Re} \to \mathrm{Re}^+$ is continuous and satisfies the
functional equation


$\psi(kx) = K(k)\psi(x)$, (4.11)
for $k > 0$ and $K(k) > 0$. Then $\psi(x) = c$, for some constant c.
Proof. In the proof of Theorem 4.5, $\gamma(x)$ is well-defined for all $x \in \mathrm{Re}$.
Use the Corollary to Theorem 4.3 to conclude that $\gamma(x) \equiv 0$. Thus,


$0 = \ln[\psi(x)/\psi(1)]$,
so $\psi(x)/\psi(1) = 1$, so $\psi(x) = \psi(1)$, all x. ■


THEOREM 4.6 (Luce). Suppose the psychophysical function is continuous,
suppose $f: A \to \mathrm{Re}^+$ is a ratio scale and $g: A \to \mathrm{Re}$ is an interval scale in the
wide sense. Then either


$\psi(x) = \alpha \ln x + \beta$,
or
$\psi(x) = \alpha x^\beta + \delta$.


Proof.* By our convention, $\psi: \mathrm{Re}^+ \to \mathrm{Re}$. Since admissible transforma-
tions of f (similarity transformations) lead to admissible transformations of
g (positive linear transformations), we derive the functional equation


$\psi(kx) = K(k)\psi(x) + C(k)$, (4.13)


where $k > 0$ and $K(k) > 0$.

Case 1. $K(k) \equiv 1$. Here, we define $\gamma(x) = e^{\psi(x)}$. Thus, $\gamma: \mathrm{Re}^+ \to \mathrm{Re}^+$
and $\gamma$ is continuous. Since $K(k) \equiv 1$, Eq. (4.13) becomes $\gamma(kx) =
D(k)\gamma(x)$, where $D(k) = e^{C(k)} > 0$.


By Corollary 1 to Theorem 4.5, we conclude that $\gamma(x) = \delta x^\alpha$, where
$\delta > 0$. Taking logarithms, we obtain $\psi(x) = \alpha \ln x + \beta$, where $\beta = \ln \delta$.


Case 2. $K(k) \not\equiv 1$. We shall assume that $\psi(x)$ satisfies (4.13), and $\psi(x)$ is
not constant. For if $\psi$ is constant, then $\psi(x) = 0 \cdot x^\beta + \delta$, some $\delta$, and we
are done. Using Eq. (4.13), we find that


$\psi(k_1 k_2 x) = K(k_1 k_2)\psi(x) + C(k_1 k_2)$. (4.14)
Also using Eq. (4.13),


$\psi(k_1 k_2 x) = K(k_1)\psi(k_2 x) + C(k_1)$
$= K(k_1)[K(k_2)\psi(x) + C(k_2)] + C(k_1)$,

*The author thanks J. Rosenstein for some of the ideas of this proof. 
so

$\psi(k_1 k_2 x) = K(k_1)K(k_2)\psi(x) + K(k_1)C(k_2) + C(k_1)$. (4.15)
Similarly,

$\psi(k_1 k_2 x) = K(k_2)K(k_1)\psi(x) + K(k_2)C(k_1) + C(k_2)$. (4.16)
Note that if

$a\psi(x) + b = c\psi(x) + d$
for all $x \in \mathrm{Re}^+$, then
$(a - c)\psi(x) = d - b$

for all $x \in \mathrm{Re}^+$. Thus, if $a - c \neq 0$,


$\psi(x) = \frac{d - b}{a - c}$


is a constant, contrary to assumption. Thus, we conclude a = c and
therefore b = d. By this line of reasoning, Eqs. (4.14) and (4.15) imply


$K(k_1 k_2) = K(k_1)K(k_2)$ (4.17)
for all $k_1, k_2 \in \mathrm{Re}^+$, and Eqs. (4.15) and (4.16) imply
$K(k_1)C(k_2) + C(k_1) = K(k_2)C(k_1) + C(k_2)$ (4.18)
for all $k_1, k_2 \in \mathrm{Re}^+$.
Since $K(k) \not\equiv 1$ by the assumption of Case 2, there is $k_1$ so that
$K(k_1) \neq 1$. Then for all k, Eq. (4.18) with $k_2 = k$ implies that


$C(k)[1 - K(k_1)] = C(k_1)[1 - K(k)]$.


Thus, since $K(k_1) \neq 1$,


$C(k) = C(k_1)\left[\frac{1 - K(k)}{1 - K(k_1)}\right]$. (4.19)


Using this value of C(k) in Eq. (4.13), we have


$\psi(kx) = K(k)\psi(x) + C(k_1)\left[\frac{1 - K(k)}{1 - K(k_1)}\right]$
$= K(k)\left[\psi(x) - \frac{C(k_1)}{1 - K(k_1)}\right] + \frac{C(k_1)}{1 - K(k_1)}$.
Since $\psi$ was assumed not constant, there is $x_0$ so that


$\psi(x_0) \neq \frac{C(k_1)}{1 - K(k_1)}$.
Thus,

$K(k) = \frac{\psi(kx_0) - \frac{C(k_1)}{1 - K(k_1)}}{\psi(x_0) - \frac{C(k_1)}{1 - K(k_1)}}$.


It follows that K is a continuous function of k. Thus, by Theorem 4.4, Eq.
(4.17) implies that either


$K(k) \equiv 0$
or
$K(k) = k^\beta$,
some $\beta$. If $K(k) \equiv 0$, then Eq. (4.13) implies that
$\psi(x) = \psi(1 \cdot x) = C(1)$,


so $\psi$ is a constant, contrary to assumption. Thus, $K(k) = k^\beta$, some $\beta$.
We next claim that if


$\delta = \frac{C(k_1)}{1 - K(k_1)} = \frac{C(k_1)}{1 - k_1^\beta}$,
and if
$\psi^*(x) = \gamma x^\beta + \delta$,


then $\psi^*$ satisfies Eq. (4.13) for every $\gamma$. (The number $\delta$ is well-defined,
since $1 \neq K(k_1) = k_1^\beta$.) The claim follows, since


$\psi^*(kx) = \gamma k^\beta x^\beta + \delta$
$= k^\beta(\gamma x^\beta + \delta) + \delta(1 - k^\beta)$
$= k^\beta(\gamma x^\beta + \delta) + \frac{C(k_1)(1 - k^\beta)}{1 - k_1^\beta}$
$= k^\beta \psi^*(x) + C(k)$,
using Eq. (4.19).
Finally, we show that any continuous solution $\psi$ of Eq. (4.13) is of the
form $\alpha x^\beta + \delta$. For suppose $\psi^*(x) = \gamma x^\beta + \delta$ and $\psi(x_0) \neq \psi^*(x_0)$, some
$x_0 > 0$. Suppose $\psi(x_0) > \psi^*(x_0)$. Now given $x > 0$, $x = kx_0$ some $k > 0$.
Thus,

$\psi(x) = \psi(kx_0)$

$= K(k)\psi(x_0) + C(k)$

$> K(k)\psi^*(x_0) + C(k)$

$= \psi^*(kx_0)$

$= \psi^*(x)$.
Let $\lambda(x) = \psi(x) - \psi^*(x)$. Then $\lambda(x) > 0$, all $x \in \mathrm{Re}^+$, so $\lambda$ is a function
from $\mathrm{Re}^+$ to $\mathrm{Re}^+$. Moreover, by Eq. (4.13), $\lambda$ must satisfy the functional
equation


$\lambda(kx) = K(k)\lambda(x)$, (4.20)


$k > 0$, $K(k) > 0$. Since both $\psi$ and $\psi^*$ are continuous, so is $\lambda$. Thus,
Corollary 1 to Theorem 4.5 applies to $\lambda$. We conclude that $\lambda(x) = ax^b$,
$a > 0$. Substituting this into Eq. (4.20), we find that $b = \beta$. Thus, $\lambda(x) =
ax^\beta$, and
$\psi(x) = \psi^*(x) + \lambda(x)$
$= \gamma x^\beta + \delta + ax^\beta$
$= (\gamma + a)x^\beta + \delta$.


Thus, $\psi$ has the form $\alpha x^\beta + \delta$, for $\alpha = \gamma + a$.
A similar proof applies if $\psi(x_0) < \psi^*(x_0)$. ■
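
The two cases of the proof can be checked numerically. The following sketch verifies on a grid that the logarithmic law satisfies Eq. (4.13) with K(k) identically 1 and C(k) a multiple of ln k, and that the power law with an additive constant satisfies it with K(k) a power of k and C(k) of the form given by Eq. (4.19). All parameter values and names are assumptions made for the illustration.

```python
import math

def check_interval_equation(psi, K, C, ks, xs, tol=1e-9):
    """Check psi(k*x) == K(k)*psi(x) + C(k) for every k in ks and x in xs."""
    return all(
        math.isclose(psi(k * x), K(k) * psi(x) + C(k), rel_tol=tol, abs_tol=tol)
        for k in ks for x in xs
    )

alpha, beta, delta = 1.5, 0.4, -2.0     # hypothetical values
grid = [0.2, 0.5, 1.0, 3.0, 8.0]

# Case 1: K(k) = 1 for all k, logarithmic law, C(k) = alpha * ln k.
log_law = lambda x: alpha * math.log(x) + beta
case1 = check_interval_equation(
    log_law, lambda k: 1.0, lambda k: alpha * math.log(k), grid, grid)

# Case 2: K(k) = k**beta, power law plus constant,
# C(k) = delta * (1 - k**beta), which is the shape required by Eq. (4.19).
shifted_power_law = lambda x: alpha * x ** beta + delta
case2 = check_interval_equation(
    shifted_power_law, lambda k: k ** beta,
    lambda k: delta * (1.0 - k ** beta), grid, grid)

print(case1, case2)
```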


THEOREM 4.7 (Luce). Suppose the psychophysical function $\psi$ is continuous,
$f: A \to \mathrm{Re}$ is an interval scale, and $g: A \to \mathrm{Re}^+$ is a ratio scale in the wide
sense. Then $\psi$ is constant.


Proof. By our convention, $\psi: \mathrm{Re} \to \mathrm{Re}^+$. Once again, by using admissible
transformations, we derive a functional equation:


$\psi(kx + c) = K(k, c)\psi(x)$, (4.21)


$k > 0$, $K(k, c) > 0$. Let $c = 0$ in Eq. (4.21), note that $\psi$ is nonzero, since its
range is contained in $\mathrm{Re}^+$, and apply Corollary 2 to Theorem 4.5.
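
To see informally why only constants survive in Theorem 4.7, the sketch below takes one positive, nonconstant candidate (the exponential, our choice for illustration) and measures how much the ratio of the two sides of Eq. (4.21) varies with x. A translation alone causes no trouble, but a change of unit does; a constant function passes both tests. This is a spot check on a few points, not a proof.

```python
import math

def ratio_spread(psi, k, c, xs):
    """Spread of psi(k*x + c) / psi(x) over xs.

    A spread near zero means a single value K(k, c) fits all tested x.
    """
    ratios = [psi(k * x + c) / psi(x) for x in xs]
    return max(ratios) - min(ratios)

xs = [-2.0, -1.0, 0.0, 1.0, 2.0]
exp_law = math.exp               # positive and nonconstant (illustrative candidate)
const_law = lambda x: 3.0        # positive constant

print(ratio_spread(exp_law, 1.0, 0.5, xs))    # translation only
print(ratio_spread(exp_law, 2.0, 0.0, xs))    # change of unit only
print(ratio_spread(const_law, 2.0, 0.5, xs))  # constant function
```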


THEOREM 4.8 (Luce). Suppose the psychophysical function $\psi$ is continuous
and $f: A \to \mathrm{Re}$ and $g: A \to \mathrm{Re}$ are both interval scales, the latter in the wide
sense. Then
$\psi(x) = \alpha x + \beta$.


Proof. By our convention, $\psi: \mathrm{Re} \to \mathrm{Re}$. Once again, we derive a func-
tional equation


$\psi(kx + c) = K(k, c)\psi(x) + C(k, c)$, (4.22)
where $k > 0$, $K(k, c) > 0$. If we set $c = 0$, then Eq. (4.22) reduces to Eq.
(4.13) and so by the proof of Theorem 4.6, at least for $x > 0$, $\psi(x) =
\alpha \ln x + \beta$ or $\psi(x) = \alpha x^\beta + \delta$. In the former case, set $k = c = 1$ in Eq.
(4.22). It follows that for $x > 0$,

$\alpha \ln(x + 1) + \beta = K(1, 1)[\alpha \ln x] + K(1, 1)\beta + C(1, 1)$.


Differentiating with respect to x, we obtain


$\frac{\alpha}{x + 1} = \frac{K(1, 1)\alpha}{x}$.
If $\alpha \neq 0$, we obtain
$K(1, 1) = \frac{x}{x + 1}$, (4.23)


all positive x. Setting $x = 1$ and $x = 2$ in Eq. (4.23), we obtain $K(1, 1) = 1/2$
and $K(1, 1) = 2/3$, respectively, a contradiction. The conclusion is that $\alpha$
must be 0. Thus, $\psi(x) = \alpha \ln x + \beta = \beta$, for all x positive.

Now given $k > 0$ and c, find $x > 0$ so that $kx + c > 0$. Then $\psi(kx + c)
= \beta$ and $\psi(x) = \beta$, so Eq. (4.22) implies that


$\beta = K(k, c)\beta + C(k, c)$, (4.24)


all $k > 0$, all c. Now given $y \leq 0$, choose c so that $y = x + c$, some $x > 0$.
Then, using (4.22) and (4.24), we have


$\psi(y) = K(1, c)\psi(x) + C(1, c)$
$= K(1, c)\beta + C(1, c)$
$= \beta$.
Thus, $\psi \equiv \beta$.


In the case that $\psi(x) = \alpha x^\beta + \delta$, $x > 0$, assume $\alpha \neq 0$. For otherwise,
$\psi(x) = \delta$, all x, follows as in the previous case. Using $\psi(x) = \alpha x^\beta + \delta$,
$x > 0$, set $k = c = 1$ in Eq. (4.22). It follows that for $x > 0$,
$\alpha(x + 1)^\beta + \delta = K(1, 1)\alpha x^\beta + K(1, 1)\delta + C(1, 1)$.
Differentiating with respect to x, we obtain
$\alpha\beta(x + 1)^{\beta - 1} = \alpha\beta K(1, 1)x^{\beta - 1}$, (4.25)


all positive x. Since $\alpha \neq 0$, setting $x = 1$ and $x = 2$ in (4.25) gives $\beta 2^{\beta - 1}
= \beta K(1, 1) = \beta(3/2)^{\beta - 1}$. We conclude that $\beta = 0$ or $\beta = 1$, so $\psi(x) =
\alpha + \delta$ or $\psi(x) = \alpha x + \delta$, all x positive.

Thus in all cases, we have, for all x positive, $\psi(x) = \alpha x + \beta$, some $\alpha$, $\beta$.
We wish to show this for all real x. Given any $x > 0$ and $c > 0$, we know
that


$\psi(x + c) = \alpha(x + c) + \beta = \alpha x + (\alpha c + \beta)$. (4.26)
Also, using Eq. (4.22), we have
$\psi(x + c) = K(1, c)\psi(x) + C(1, c)$,


so
$\psi(x + c) = K(1, c)\alpha x + K(1, c)\beta + C(1, c)$. (4.27)


Thus, equating the right-hand sides of Eqs. (4.26) and (4.27), we conclude
that for all $x > 0$ and $c > 0$,


$\alpha x + (\alpha c + \beta) = K(1, c)\alpha x + [K(1, c)\beta + C(1, c)]$.
Now as in the proof of Theorem 4.6,


$ax + b = cx + d$


for all $x > 0$ implies that $a = c$ and $b = d$. Thus, $\alpha = K(1, c)\alpha$, or, since
$\alpha \neq 0$, $K(1, c) = 1$, all $c > 0$. Moreover,


$\alpha c + \beta = K(1, c)\beta + C(1, c) = \beta + C(1, c)$,
or $C(1, c) = \alpha c$, all $c > 0$. Now, given any x, choose $c > 0$ such that
$x + c > 0$. Then
$\alpha x + \alpha c + \beta = \alpha(x + c) + \beta$

$= \psi(x + c)$

$= K(1, c)\psi(x) + C(1, c)$

$= \psi(x) + \alpha c$.
It follows that $\psi(x) = \alpha x + \beta$. ■
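
Two steps of this proof lend themselves to a quick numerical check. The sketch below first evaluates the right-hand side of Eq. (4.23) at two points, showing that no single constant K(1, 1) can work for the logarithmic candidate, and then verifies on a grid that a linear function satisfies Eq. (4.22). The particular forms chosen for K and C in the code are our assumption: one pair that works, not the only way to write them.

```python
import math

def k11_from_log_law(x):
    """Value K(1, 1) would need at x by Eq. (4.23), if alpha != 0."""
    return x / (x + 1.0)

def check_linear_law(alpha, beta, ks, cs, xs, tol=1e-9):
    """Check psi(k*x + c) == K(k, c)*psi(x) + C(k, c) for psi(x) = alpha*x + beta."""
    psi = lambda x: alpha * x + beta
    K = lambda k, c: k                               # assumed form of K
    C = lambda k, c: alpha * c + beta * (1.0 - k)    # assumed form of C
    return all(
        math.isclose(psi(k * x + c), K(k, c) * psi(x) + C(k, c),
                     rel_tol=tol, abs_tol=tol)
        for k in ks for c in cs for x in xs
    )

# Two different required values for one constant: the contradiction in the proof.
print(k11_from_log_law(1.0), k11_from_log_law(2.0))

linear_ok = check_linear_law(alpha=2.0, beta=-1.0, ks=[0.5, 1.0, 3.0],
                             cs=[-4.0, 0.0, 2.5], xs=[-3.0, 0.0, 1.0, 7.0])
print(linear_ok)
```

With k = 1 the assumed forms reduce to K(1, c) = 1 and C(1, c) equal to alpha times c, which agrees with the values derived in the proof.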
Exercises


1. In the measurement of apparent length of straight lines, a doubling
of physical length leads to (essentially) a doubling of apparent length.
More generally, suppose multiplication of physical length by a (positive)
rational amount r leads to (essentially) multiplication of apparent length
by an amount r. Show that this is enough to conclude that, if the
psychophysical function $\psi$ is continuous, then $\psi(x) = \alpha x$, some $\alpha$.


2. (Aczél [1966, p. 42]) Suppose $\psi: \mathrm{Re}^+ \to \mathrm{Re}$ satisfies


$\psi(x + y) = \psi(x) + \psi(y)$


and


$\psi(xy) = \psi(x)\psi(y)$,


for all positive x and y, and suppose $\psi$ is continuous. Show that $\psi(x) = x$
or $\psi(x) = 0$, all positive x.
3. (Aczél [1966, p. 43]) The functional equation


$\psi\left(\frac{x + y}{2}\right) = \frac{\psi(x) + \psi(y)}{2}$,


is called Jensen’s equation, after J. L. W. V. Jensen [1905, 1906]. If
$\psi: \mathrm{Re} \to \mathrm{Re}$ satisfies Jensen’s equation for all real x and y, and $\psi$ is
continuous, show that $\psi(x) = cx + a$. [Hint: Set $\gamma(x) = \psi(x) - \psi(0)$.]


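4. (Aczél [1966, pp. 46, 47]) (a) Suppose $n \geq 2$ and $\psi: \mathrm{Re} \to \mathrm{Re}$ is
continuous and satisfies


$\psi(x_1 + x_2 + \cdots + x_n) = \psi(x_1) + \psi(x_2) + \cdots + \psi(x_n)$,


all $x_1, x_2, \ldots, x_n \in \mathrm{Re}$. Show from the result of Theorem 4.1 that $\psi(x) =
cx$.


(b) Suppose $n \geq 2$ and $\psi: \mathrm{Re} \to \mathrm{Re}$ is continuous and satisfies


$\psi\left(\frac{x_1 + x_2 + \cdots + x_n}{n}\right) = \frac{\psi(x_1) + \psi(x_2) + \cdots + \psi(x_n)}{n}$,


all $x_1, x_2, \ldots, x_n \in \mathrm{Re}$. Show that $\psi(x) = cx + a$. Use the result of Exer.
3.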

5. (Aczél [1966, pp. 49, 50]) Suppose $\psi: \mathrm{Re} \to \mathrm{Re}$ is continuous and
satisfies


$\psi(x + y) = \psi(x) + \psi(y) + \psi(x)\psi(y)$,


all $x, y \in \mathrm{Re}$. In Chapter 5, we shall call such $\psi$ quasi-additive. If $\psi$ is
quasi-additive, show that either $\psi(x) \equiv -1$ or $\psi(x) = e^{cx} - 1$.
(a) Give a quick proof by setting $\gamma(x) = \psi(x) + 1$ and reducing to
the second Cauchy equation.

(b) An alternative proof goes as follows:

(i) Show that $\psi(nx) = [1 + \psi(x)]^n - 1$.

(ii) Using part (i) with $n = 2$, prove that $\psi(x) \geq -1$, all x.

(iii) Using quasi-additivity, prove that $\psi(1) = -1$ implies $\psi(x)
= -1$, all x.

(iv) If $\psi(1) \neq -1$, let $c = \ln[1 + \psi(1)]$. Using part (i), prove
that $\psi(m/n) = e^{cm/n} - 1$, all $m/n \geq 0$.

(v) Using the result of part (iv) and continuity, conclude that if
$\psi(1) \neq -1$, then $\psi(x) = e^{cx} - 1$, all $x \geq 0$.

(vi) Show that part (v) implies that if $\psi(1) \neq -1$, then $\psi(0) = 0$.
Then use quasi-additivity with $y = -x$ to prove from this that $\psi(x) =
e^{cx} - 1$, all $x \in \mathrm{Re}$.

6. (Luce [1959]) Theorems 4.5 through 4.8 hold for derived measure-
ment in general, not just for psychophysical scaling. Thus, for example, if
the independent variable f and the derived variable g are both ratio scales,
and $g = \psi(f)$ for $\psi$ continuous, then Theorem 4.5 implies that $\psi$ is a power
function. Show that the following physical and geometrical laws are of the
forms called for by Theorems 4.5 through 4.8:*

(a) For a sphere, $V = \frac{4}{3}\pi r^3$, where V = volume, r = radius.

(b) Ohm’s Law: Under fixed resistance, voltage is proportional to
current. (Voltage and current are ratio scales.)

(c) Newton’s Law of Gravitation: $F = G(mm'/r^2)$, where F is the
force of attraction, G is the gravitational constant, m and m' are the masses
of two bodies being attracted, and r is the distance between the bodies.

(d) If a body of constant mass is moving at velocity v, then its
energy is $\alpha v^2 + \delta$, $\alpha$, $\delta$ constant. (Energy is an interval scale.)

(e) If the temperature of a perfect gas is constant, then as a function
of pressure p, the entropy of the gas is $\alpha \log p + \beta$, $\alpha$, $\beta$ constant. (En-
tropy defines an interval scale.)

(f) For a square, $A = l^2$, where A is area and l is length.


7. (a) Show that if the variables on both the independent (physical) 
and dependent (psychological) scales are dimensionless (that is, define 
absolute scales), then there is no restriction on the psychophysical func- 
tion. 


(b) Dimensionless scales can be constructed rather easily. For exam-
ple, suppose a variable x defines a ratio scale, but some value $x_0$ is taken as
a reference value and the scale used is $x/x_0$. Show that $x/x_0$ defines an
absolute scale.


*Luce [1959] and Rozeboom [1962a] give examples of physical laws that seem to violate
the conclusions of Theorems 4.5 through 4.8, for example the exponential law of radioactive
decay. Luce [1959, 1962] argues that such violations occur if the independent variable enters
the equation in a “dimensionless fashion,” and argues that the conclusions of Theorems 4.5
through 4.8 hold only if there are no “dimensional parameters” present. The admissible
transformations of such dimensional parameters are not determined by a measurement
theory, but are included in the statement of a law. See the Remark at the end of Section 2.5
for a more detailed discussion.


8. (a) In Theorem 4.5, show that $\beta$ is independent of the units of f and of
g; that is, $\beta$ doesn’t change if admissible transformations are applied to
either f or g. For example, show that if $\psi'$ relates kf to g, then $\psi'(x) = \alpha' x^\beta$,
for the same $\beta$ as in the relation of f to g.

(b) In Theorem 4.6, if $\psi(x) = \alpha \ln x + \beta$, show that $\alpha$ is indepen-
dent of the unit of f. What about $\beta$?

(c) In Theorem 4.6, if $\psi(x) = \alpha x^\beta + \delta$, show that $\beta$ is independent
of the units of both f and g. What about $\delta$? What about $\alpha$?

(d) In Theorem 4.8, what is the dependence of $\alpha$ and of $\beta$ on the
units of f and g?


9. (Luce [1959]) Derive a functional equation for the psychophysical
function $\psi$ in each of the following cases:
(a) f is a ratio scale, g is a log-interval scale (cf. Section 2.3). 
(b) f is an interval scale, g is a log-interval scale. 
(c) f is a log-interval scale, g is a ratio scale. 
(d) f is a log-interval scale, g is an interval scale. 
(e) f is a log-interval scale, g is a log-interval scale. 


10. (Luce [1959]) (a) In case (a) of Exer. 9, if $\psi$ is continuous, show that
either $\psi(x) = \delta e^{\alpha x^\beta}$ or $\psi(x) = \alpha x^\beta$. [Hint: Take ln of the functional
equation, let $\gamma = \ln \psi$, and reduce to one of the functional equations in
Table 4.4.]

(b) In case (b) of Exer. 9, if $\psi$ is continuous, show that $\psi(x) = \alpha e^{\beta x}$.
[Hint: The method of proof for (a) applies.]

(c) In case (c) of Exer. 9, if $\psi$ is continuous, show that $\psi(x)$ is
constant. [Hint: Take $\gamma(\ln x) = \psi(x)$ and reduce to one of the functional
equations in Table 4.4.]

(d) In case (d) of Exer. 9, if $\psi$ is continuous, show that $\psi(x) =
\alpha \ln x + \beta$. [Hint: The method of proof of part (c) applies.]

(e) In case (e) of Exer. 9, if $\psi$ is continuous, show that $\psi(x) = \alpha x^\beta$.
[Hint: Take ln of the functional equation, let $\gamma = \ln \psi$, and reduce to one
of the functional equations of Table 4.4.]


11. (a) Suppose $f: \mathrm{Re} \times \mathrm{Re} \to \mathrm{Re}$ satisfies
$f(x_1 + y_1, x_2 + y_2) = f(x_1, x_2) + f(y_1, y_2)$


for all $x_1, x_2, y_1, y_2$ in Re, and suppose f is continuous. Find all such
functions f.
(b) Generalize to continuous $f: \mathrm{Re}^n \to \mathrm{Re}$ satisfying


$f(x_1 + y_1, x_2 + y_2, \ldots, x_n + y_n) = f(x_1, x_2, \ldots, x_n) + f(y_1, y_2, \ldots, y_n)$.


12. Show that the solutions given to Cauchy’s equations in Theorems 4.1
through 4.4 hold for functions $\psi$ which are assumed monotone increasing
rather than continuous, for such functions are continuous at at least one
point.

13. (Aczél [1966, pp. 105-106], Eichhorn [1978, pp. 3-4, 10-11]) Func-
tional equations have a wide variety of applications in economics, as
illustrated in the recent book by Eichhorn [1978]. This exercise and Exer.
15 present some of these applications. Suppose $I(K, t)$ represents the
compound interest earned by a capital K during a time interval of length t.
Thus, $I: \mathrm{Re}^+ \times \mathrm{Re}^+ \to \mathrm{Re}^+$. If the interest accrued is not changed if K is
divided into two separate capital investments $K_1$ and $K_2$, we have


(i) $I(K_1 + K_2, t) = I(K_1, t) + I(K_2, t)$, $K_1, K_2, t > 0$.


Also, if the interest rate doesn’t change over the length of an account, the
amount of interest on $I(K, t_1)$ during a time interval of length $t_2$ is equal to
the amount of interest on K during a time interval of length $t_1 + t_2$. Hence,


(ii) $I[I(K, t_1), t_2] = I(K, t_1 + t_2)$, $K, t_1, t_2 > 0$.


Finally, note that I is monotone increasing in each variable.

(a) Since I is monotone increasing, Exer. 12 implies that the results
of Theorems 4.1 through 4.4 apply. Show from the above assumptions that
there is $h(t) > 0$ so that


$I(K, t) = h(t)K$.
(b) Show from (ii) that
$h(t_1 + t_2) = h(t_1)h(t_2)$, $t_1, t_2 > 0$.


(c) Show that $I(K, t) = Kq^t$, some $q > 1$. Hence, the standard way
of computing compound interest follows from the simple assumptions we
have made.

14. (Eichhorn [1978, pp. 51-52]) (a) Suppose $\psi: \mathrm{Re}^n \to \mathrm{Re}$ is monotone
increasing in each variable and satisfies


$\psi(x + y) = \psi(x) + \psi(y)$,


all $x, y \in \mathrm{Re}^n$. Show that there is a vector $c = (c_1, c_2, \ldots, c_n)$ in $\mathrm{Re}^n$ such
that

$\psi(x) = c \cdot x$,
where $c \cdot x$ is the dot product $c_1 x_1 + c_2 x_2 + \cdots + c_n x_n$.


(b) Show that the conclusion of (a) still holds if the domain of $\psi$ is
changed to all nonnegative real vectors of length n. (Hint: Note that the
proof of Theorem 4.1 can be modified to hold for the domain of $\psi$ equal to
the nonnegative reals.)
(c) Show that the conclusion of (a) still holds if the domain of $\psi$ is
changed to all nonnegative real vectors of length n except the vector 0.


15. (Eichhorn [1978, pp. 53-58]) In Section 2.6.2, we studied index
numbers. We can look at an index number (consumer price index, con-
sumer confidence index, etc.) as a function $I: A \times A \to \mathrm{Re}$, where A is the
set of all nonnegative real vectors of length n except the vector 0. One
often takes

$I(x, u) = \alpha \cdot u / \beta \cdot x = \sum \alpha_i u_i / \sum \beta_i x_i$,

where $x = (x_1, x_2, \ldots, x_n)$ and $u = (u_1, u_2, \ldots, u_n)$ are in A, $\alpha =
(\alpha_1, \alpha_2, \ldots, \alpha_n)$ and $\beta = (\beta_1, \beta_2, \ldots, \beta_n)$ are vectors of positive reals, and
$\cdot$ means dot product. For price indices, $x_i = p_i(t)$, the price of good i in
year t, and $u_i = p_i(0)$. The Paasche price index (Exer. 5, Section 2.6.2) uses
$\alpha_i = \beta_i = q_i(t)$, the quantity of good i in an average market basket in year
t, and the Laspeyres price index (Exer. 5, Section 2.6.2) uses $\alpha_i = \beta_i =
q_i(0)$. From $I(x, u) = \alpha \cdot u / \beta \cdot x$ and $\alpha_i > 0$, $\beta_i > 0$, all i, it follows that


(i) $I(x, u + v) = I(x, u) + I(x, v)$;

(ii) $\frac{1}{I(x + y, u)} = \frac{1}{I(x, u)} + \frac{1}{I(y, u)}$;

(iii) $I(x, u) > 0$, all $x, u \in A$.


This exercise presents a sketch of the proof that if I is monotone decreas-
ing in its first variable and monotone increasing in its second variable, then
these three conditions essentially determine the general formula


$I(x, u) = \alpha \cdot u / \beta \cdot x$.


(a) Use Exer. 14 to show from monotonicity, (ii), and (iii) that there
is a vector $b(u) = [b_1(u), b_2(u), \ldots, b_n(u)]$, with each $b_i(u) > 0$, such that


$\frac{1}{I(x, u)} = b(u) \cdot x$.
(b) Using the results of (a) and (i), show that
$\frac{1}{b(u + v) \cdot x} = \frac{1}{b(u) \cdot x} + \frac{1}{b(v) \cdot x}$.
(c) Show that for all i,
$\frac{1}{b_i(u + v)} = \frac{1}{b_i(u)} + \frac{1}{b_i(v)}$.
(d) Show that for all i, there is a vector $a^i$ with all entries positive
such that


$\frac{1}{b_i(u)} = a^i \cdot u$.


(See the hint to Exer. 14b.)
(e) Use parts (a) and (d) to show that for all x, u,

$\frac{1}{I(x, u)} = \frac{x_1}{a^1 \cdot u} + \frac{x_2}{a^2 \cdot u} + \cdots + \frac{x_n}{a^n \cdot u}$.


(f) Let $e^i = (0, \ldots, 0, 1, 0, \ldots, 0)$, with a 1 in the ith compo-
nent. Show that the result of (e) implies that for all u in A,


$I(e^1 + e^2 + \cdots + e^n, u) = \left(\frac{1}{a^1 \cdot u} + \frac{1}{a^2 \cdot u} + \cdots + \frac{1}{a^n \cdot u}\right)^{-1}$.


(g) Show that for some vector a with each component $a_i > 0$,
$I(e^1 + e^2 + \cdots + e^n, u) = a \cdot u$,


all u in A.
(h) Show from the results of (f) and (g) that


$(a \cdot u)\left(1 + \frac{a^1 \cdot u}{a^2 \cdot u} + \cdots + \frac{a^1 \cdot u}{a^n \cdot u}\right) = a^1 \cdot u$.
(i) Show that
$[a^1 - \gamma(u)a] \cdot u = 0$,


where


$\gamma(u) = 1 + \frac{a^1 \cdot u}{a^2 \cdot u} + \cdots + \frac{a^1 \cdot u}{a^n \cdot u}$

and $\gamma(u) > 0$ for all u in A.

(j) It follows from the result of part (i) that $a^1 = \lambda_1 a$, $\lambda_1 > 0$. This is
because if $a^1 > 0$ and $a > 0$ are linearly independent, the result of (i)
cannot hold for all u, since it cannot hold for any u not perpendicular to
the plane spanned by $a^1$ and a. Similarly, we conclude that for all i,
$a^i = \lambda_i a$, $\lambda_i > 0$. Using this conclusion, show from the result of part (e)
that for all x, u in A,


$I(x, u) = \frac{a \cdot u}{\beta \cdot x}$


for some $\beta_i > 0$.
(k) Show that if in addition $I(x, x) = 1$ for all x in A, a reasonable
assumption for price indices, then


$I(x, u) = \frac{a \cdot u}{a \cdot x}$.


Note: For further applications of functional equations to the study of price
indices, see Eichhorn [1978, Chapter 8].
4.3 The Power Law
4.3.1 Magnitude Estimation 


In this section, we shall present a justification of the power law and then 
investigate some consequences of this law. 

Judgments of subjective loudness are made in the laboratory in various 
ways. Stevens [1957] classifies four different methods. One of the most 
common is the method of magnitude estimation, which we encountered in 
our discussion of choice of most important variables in Section 2.6. The 
subject hears a reference sound (or light or other stimulus) and is told to 
assign it a fixed number, say 100. Then he is presented other sounds (or 
lights, etc.) and asked to assign them numbers proportionate to the 
apparent loudness (or brightness, etc.). For example, if a sound seems 
twice as loud as the reference sound, it should be assigned a value of 200; 
if it sounds one half as loud, it should be assigned a value of 50. 
Magnitude estimation may also be performed without a reference stimulus, 
letting the subject pick out his own reference stimulus. The data of Fig. 4.2
show the results of one magnitude estimation for loudness, plotted in
log-log coordinates. In log-log coordinates, the psychophysical function, if
it were a power function, would appear as a straight line whose slope is the
exponent in the power law. Figure 4.2 shows a straight line which has been
fitted to the data. This line has a slope of .3, and the number .3 has been
used by Stevens as an estimate of the exponent of the power function for
loudness.


[Figure 4.2 plot: log of magnitude estimation (vertical axis, log 1 to log 200) against log intensity (horizontal axis, 40 to 90).]

Figure 4.2. Magnitude estimation judgments of loudness in log-log coordinates, fitted with a
straight line of slope 0.3. Adapted from Galanter and Messick [1961, p. 366]. Copyright 1961
by American Psychological Association. Reprinted by permission.
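
The reading of the exponent as a slope in log-log coordinates can be sketched as an ordinary least-squares line through the logged data. The data below are synthetic, generated from an exact power function with exponent 0.3; they are not the Galanter and Messick judgments, and the function name is ours.

```python
import math

def fit_loglog_slope(intensities, estimates):
    """Least-squares slope and intercept of log10(estimate) on log10(intensity)."""
    xs = [math.log10(v) for v in intensities]
    ys = [math.log10(v) for v in estimates]
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

# Synthetic, noise-free data (an assumption for illustration only).
intensities = [10.0 ** p for p in range(4, 10)]
estimates = [0.5 * v ** 0.3 for v in intensities]

slope, intercept = fit_loglog_slope(intensities, estimates)
print(slope, 10.0 ** intercept)   # estimated exponent and multiplicative constant
```

With real judgments the points scatter about the line, and the fitted slope is only an estimate of the exponent.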

A variant of magnitude estimation is magnitude production. An experi- 
menter names magnitudes and asks the subject to adjust stimuli to match 
those magnitudes. In ratio production, a subject is asked to select a stimulus 
that is one-half as loud as a given stimulus, or one that is three times as 
loud, etc. Finally, in ratio estimation, the experimenter presents two stimuli 
and asks the subject to estimate the ratio between them. 

Subjects seem to feel quite comfortable with all these procedures. 
Stevens argues that the results of such scaling procedures are ratio scales of 
loudness and other psychological variables. He doesn’t “prove” that mag- 
nitude estimation, for example, leads to a ratio scale, because he develops 
no representation and uniqueness theorems. Rather, he uses the principle 
we have discussed in Chapter 2, that an admissible transformation is one 
that keeps “intact the empirical information depicted by the scale” 
(Stevens [1968, p. 850]). With this idea, it seems plausible that the only 
admissible transformations of judgments obtained using methods like 
magnitude estimation are those that preserve ratios. Hence, it is suggested 
that magnitude estimation gives rise to a ratio scale. 

Now measurement of physical intensity of a sound (or intensity of a 
light, etc.) is on a ratio scale. Hence, if the corresponding psychological 
scale of loudness (or brightness, etc.) is also a ratio scale (in the wide 
sense), and if the psychophysical function is continuous, it follows from 
Theorem 4.5 that the psychophysical function $\psi$ is a power function.*

It should be remarked that not all psychophysicists accept the argument 
that magnitude estimation leads to a ratio scale. Indeed, there are some 
who claim that the entire procedure of magnitude estimation is nonsense. 
(See, for example, Final Report of the British Association for the Advance- 
ment of Science [1940] or Moon [1936] for early criticisms.) Perhaps more 
important, there are those who feel that specifying uniqueness of the 
derived scale of loudness (or brightness or subjective duration) before 
specifying the psychophysical law is begging the question, and so the whole 
approach described above is useless. We shall not attempt to settle this 
issue here. However, in Section 4.4, we shall present axioms that, if 
satisfied, imply that magnitude estimation leads to a ratio scale. 


*If the psychological scale is only an interval scale (in the wide sense), then it follows from 
Theorem 4.6 that the possible psychophysical laws are essentially either the logarithmic law or 
the power law. 
4.3.2 Consequences of the Power Law


The usual way of testing a proposed scientific law is to derive predic- 
tions from the law and subject these to test. These predictions can be new, 
or they can be previously observed results, testable by previously gathered 
data, or requiring new experiments or observations to be tested. In this 
subsection and the next, we shall derive some predictions or consequences 
of using the power law as the psychophysical law for the case of loudness, 
and discuss tests of these consequences. 

In particular, assuming the power law, it is possible to estimate the
parameter $\beta$ experimentally. In the case of loudness, Stevens [1955, and
elsewhere] has estimated $\beta$ for 1000-cps pure tones by having subjects use
the method of magnitude estimation. As we pointed out above, he esti-
mates that $\beta \approx 0.3 \approx \log_{10} 2$. A sample of such a magnitude estimation
was given in Fig. 4.2. Exponents for other psychophysical variables are
shown in Table 4.2. These were all determined by use of the method of
magnitude estimation.

Using a value of $\beta = \log_{10} 2$, we find that the formula


$L(x) = \psi(I(x))$

for loudness becomes

$L(x) = \alpha I(x)^{\log_{10} 2}$. (4.28)
Recall that

$\mathrm{dB}(x) = 10 \log_{10}[I(x)/I_0]$.

Hence,

$I(x) = I_0 10^{(1/10)\mathrm{dB}(x)}$.
It follows from Eq. (4.28) that


$L(x) = \alpha I_0^{\log_{10} 2}\left[10^{(1/10)\mathrm{dB}(x)}\right]^{\log_{10} 2}$. (4.29)


It is interesting to demonstrate that from Eq. (4.29) we may derive as a
conclusion the observation made by Fletcher and Munson (Section 4.1.2)
that an increase of 10 dB in intensity is equivalent to a doubling of
loudness. For, if $\mathrm{dB}(b) = \mathrm{dB}(a) + 10$, then by (4.29),


$L(b) = \alpha I_0^{\log_{10} 2}\left[10^{(1/10)(\mathrm{dB}(a) + 10)}\right]^{\log_{10} 2}$


$= \alpha I_0^{\log_{10} 2}\left[10^{(1/10)\mathrm{dB}(a)} \cdot 10\right]^{\log_{10} 2}$


$= \alpha I_0^{\log_{10} 2}\left[10^{(1/10)\mathrm{dB}(a)}\right]^{\log_{10} 2} \cdot 10^{\log_{10} 2}$


$= L(a) \cdot 2$.


Similar results hold for pure tones other than the 1000-cps tones, and for 
other kinds of noises. Indeed, Stevens [1955, p. 825] says that he is 
increasingly convinced by data that the Fletcher-Munson observation 
holds for all continuous noises of engineering interest. 
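
The derivation of the 10-dB rule from Eq. (4.29) can be replayed numerically. In the sketch below the constant multiplier and the reference intensity are placeholders set to 1, since they cancel in the ratio; the function names and the three test levels are our own choices.

```python
import math

BETA = math.log10(2.0)    # exponent used in Eq. (4.28)

def loudness(db, alpha=1.0, i0=1.0):
    """L = alpha * I**BETA with I = i0 * 10**(dB/10); alpha and i0 are placeholders."""
    intensity = i0 * 10.0 ** (db / 10.0)
    return alpha * intensity ** BETA

def ratio_for_increase(db, step=10.0):
    """Loudness ratio produced by raising the level from db to db + step."""
    return loudness(db + step) / loudness(db)

for level in (40.0, 60.0, 85.0):
    print(level, ratio_for_increase(level))
```

The ratio comes out as 2 up to rounding at every starting level, which is the content of the Fletcher-Munson observation under this value of the exponent.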


4.3.3 Cross-Modality Matching


The power law has passed a basic test required of any such law: it 
accounts for known empirical data. An additional test may be provided by 
asking for new predictions. Stevens has used the idea of “cross-modality 
matching” to test the power law. Observers are apparently able to match 
the strengths of the sensations produced in two different modalities, for 
example, loudness and brightness. For instance, a person can adjust the 
brightness of a light to match the perceived loudness of a 50-dB pure tone 
at 1000 cps. In general, suppose two different psychological quantities such 
as loudness and brightness are each related to physical quantities by a 
power law. Suppose $A_1$ and $A_2$ are sets, $f_1: A_1 \to \mathrm{Re}$ and $f_2: A_2 \to \mathrm{Re}$
represent physical scales, and $g_1: A_1 \to \mathrm{Re}$ and $g_2: A_2 \to \mathrm{Re}$ represent the
corresponding psychological scales. Then we have


$g_1 = \alpha_1 f_1^{\beta_1}$ (4.30)
and
$g_2 = \alpha_2 f_2^{\beta_2}$. (4.31)


Subjects are now asked to match a given $a \in A_1$ to an $a' \in A_2$ of “equal
sensation.” If the two power laws (4.30) and (4.31) are in effect, then the
equal sensation function $g_1(a) = g_2(a')$ is given by


$\alpha_1[f_1(a)]^{\beta_1} = \alpha_2[f_2(a')]^{\beta_2}$
or


$\ln f_1(a) = \gamma + (\beta_2/\beta_1) \ln f_2(a')$,
where


$\gamma = \frac{\ln \alpha_2 - \ln \alpha_1}{\beta_1}$.

Thus if for all (a, a’) with a judged equal in sensation to a’, f,(a) is plotted 
against f,(a’) in log-log coordinates, the graph should be a straight line 
whose slope is given by the ratio of the exponents. This prediction has 
been tested for various pairs of modalities, and the results come out quite 
well (Stevens [1959a, 1960, 1962]). See Figs. 4.3 and 4.4 for typical “equal 
sensation graphs.” 
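
The straight-line prediction for equal sensation graphs can be sketched as follows. Given hypothetical parameters for the two power laws (4.30) and (4.31), the code solves the matching condition for the first physical scale value and confirms that the matched pairs fall on a line in logarithmic coordinates whose slope is the ratio of the exponents. The parameter values and function names are assumptions for illustration, not estimates from Stevens' data.

```python
import math

def matched_f1(f2_value, alpha1, beta1, alpha2, beta2):
    """Solve alpha1 * f1**beta1 == alpha2 * f2**beta2 for f1."""
    return (alpha2 * f2_value ** beta2 / alpha1) ** (1.0 / beta1)

def equal_sensation_line(alpha1, beta1, alpha2, beta2):
    """Slope and intercept of ln f1 against ln f2 predicted by (4.30) and (4.31)."""
    slope = beta2 / beta1
    intercept = (math.log(alpha2) - math.log(alpha1)) / beta1
    return slope, intercept

# Hypothetical parameters for two modalities.
a1, b1, a2, b2 = 1.2, 0.3, 0.8, 0.5
slope, intercept = equal_sensation_line(a1, b1, a2, b2)

# Each matched pair should lie on the predicted line.
on_line = [
    math.isclose(math.log(matched_f1(f2, a1, b1, a2, b2)),
                 intercept + slope * math.log(f2))
    for f2 in (1.0, 10.0, 100.0)
]
print(slope, on_line)
```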


[Figure 4.3 plot: level of fainter noise in decibels re louder (vertical axis) against level of fainter light in decibels re brighter (horizontal axis, -50 to 0).]

Figure 4.3. Equal sensation function for loudness versus brightness. Results of adjusting a 
loudness ratio to match an apparent brightness ratio defined by a pair of luminous circles. 
One of the circles was made dimmer than the other by the amount shown on the abscissa. 
The observer produced white noises by pressing on one or the other of two keys, and he 
adjusted the level of one noise (ordinate) to make the loudness ratio seem equal to the 
brightness ratio. From Stevens [1960, p. 241]. Reprinted with permission of American 
Scientist, journal of Sigma Xi, the Scientific Research Society of North America. 
[Figure 4.4 plot: force of handgrip in pounds (vertical axis) against relative intensity of criterion stimulus (horizontal axis, 10 to 10^7), with one line for each criterion stimulus: electric shock (60 ~), warmth, lifted weights, pressure on palm, cold, vibration (60 ~), white noise, 1000 ~ tone, and white light.]


Figure 4.4. Equal sensation functions. Data obtained by matching force of handgrip to 
various criterion stimuli. Each point stands for the median force exerted by ten or more 
observers to match the apparent intensity of a criterion stimulus. The relative position of a 
function along the abscissa is arbitrary. The dashed line shows a slope of 1.0 in these 
coordinates. From Stevens [1960, p. 246]. Reprinted with permission of American Scientist, 
journal of Sigma Xi, the Scientific Research Society of North America. 


4.3.4 Attitude Scaling 


The procedures such as magnitude estimation that were developed to 
help scale sensory variables like loudness and brightness have also been 
applied in scaling attitudes or opinions, judgments of pleasantness of 
musical selections, judgments of seriousness of crimes, and so on. The idea 
of using psychophysical methods to study attitudes and opinions goes back 
to Thurstone [1927, 1959]. See Stevens [1966, 1968] for a survey. Here, the 
stimuli (corresponding to the physical scales in the discussion of earlier 
sections) cannot always be assumed to be on a ratio scale. Indeed, they are 
often only on a nominal scale (Stevens [1966, 1968]). However, when 
stimuli can be measured on what looks like a ratio scale, many of the 
attitude scales seem to be a power function of the stimuli. 

To mention some references, Indow [1961] scaled desirability for wrist 
watches by asking subjects to match desirability of a watch to length of a 
line segment. This scale of preference, it can be argued, is a ratio scale. 
Then Indow had subjects state what they would regard as a fair price for 
each watch. The judged fair price (averaged over the group) turned out to 
be close to a power function of (average) desirability. 

Other experimenters have had subjects indicate their opinions of the
prestigiousness of various occupations by adjusting the intensity of a light
or the level of a sound so that levels observed are proportional to judged
prestigiousness (Cross [1976]).

Sellin and Wolfgang [1964] considered criminal offenses consisting of 
thefts of varying amounts of money, $5, $20, $50, $1000, $5000. They then 
asked subjects, using the magnitude estimation procedure, to scale the 
seriousness of each crime. The judged seriousness (geometric mean of 
group estimates) turned out to be a power function of the amount of 
money stolen, with exponent .17. The exponent suggests that, although it is 
worse to steal $2 than to steal $1, it is not twice as bad, which seems 
reasonable. 
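
The arithmetic behind this remark can be sketched as follows. Only the exponent .17 comes from the text; the function name and the sample ratios are ours, introduced for illustration.

```python
EXPONENT = 0.17  # exponent reported above for Sellin and Wolfgang [1964]

def seriousness_ratio(amount_ratio, exponent=EXPONENT):
    """Ratio of judged seriousness when the amount stolen is multiplied by amount_ratio."""
    return amount_ratio ** exponent

# Doubling the amount stolen multiplies judged seriousness by 2 ** 0.17.
doubling = seriousness_ratio(2)
# A thousandfold increase in the amount multiplies it by 1000 ** 0.17.
thousandfold = seriousness_ratio(1000)
print(doubling, thousandfold)
```

Under the power law, doubling the amount raises judged seriousness by a factor of roughly 1.13, well short of 2.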

It is possible that the judged seriousness of the crimes in an experiment 
like Sellin and Wolfgang’s could be a measure of the subjective value or 
utility of money. Most economists believe that the utility of money is not 
linearly related to the dollar amount of money, but rather that fixed 
increments become less and less important.* Bernoulli [1738] hypothesized 
that the utility of money is a logarithmic function of the dollar amount. 
Arguments that utility is a power function of dollar amount go back to the 
mathematician Gabriel Cramer in 1728 (Bernoulli [1738]). Some more 
modern arguments to this effect can be found in Stevens [1959b] and
Galanter [1962]. We return to the utility of money in Chapters 5 and 7. 


Exercises 


1. Show that, according to the data of Table 4.2, if a magnitude 
estimation were performed for the subjective number of repetitions of a 
sound, and the results were plotted in log-log coordinates, the data would 
fit a 45° straight line. 


2. Suppose we know (from data) that the psychophysical function is a 
power function, and we assume that the physical scale is a ratio scale. If 
the psychological scale is either a ratio scale, an interval scale, or a 
log-interval scale in the wide sense, show (by Theorems 4.5 and 4.6 and 
the results of Exer. 10a of Section 4.2) that the psychological scale is either 
a ratio scale or a log-interval scale. 


3. Show that if the two physical scales f, and f, are both interval scales 
and the two corresponding psychological scales g, and g, are both interval 
scales in the wide sense, then the equal sensation graph is a straight line (in 
regular coordinates). 

4. (Pfanzagl [1968, p. 128]) In cross-modality matching experiments,
Stevens [1959a] reports that if $g_1 = \psi_1(f_1)$ and $g_2 = \psi_2(f_2)$, then $\psi_2^{-1}(\psi_1)$
turns out to be a power function.

(a) Show that this result follows if both $\psi_1$ and $\psi_2$ are power functions.


*This is the principle known as decreasing marginal utility. 
(b) Show, however, that it also follows if
$$\psi_i(x) = \alpha_i \ln x + \beta_i, \quad i = 1, 2.$$


5. (Luce and Galanter [1963, p. 291]) An early experiment with painted 
disks performed by Plateau [1872] suggested that equal stimulus ratios 
must induce equal sensation ratios. If true, this observation implies the 
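power law. For, suppose $\psi: \mathrm{Re}^+ \to \mathrm{Re}^+$ is continuous, and for $s, s', t, t'$,

$$\frac{s}{t} = \frac{s'}{t'} \Rightarrow \frac{\psi(s)}{\psi(t)} = \frac{\psi(s')}{\psi(t')}.$$

Letting $u(x) = \psi(x)/\psi(1)$, derive Cauchy’s fourth equation and hence
show that $\psi(x) = \alpha x^{\beta}$.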


6. According to Stevens [1960], vibration (of 250 cps) on the finger gives 
rise to sensations that satisfy a power law with exponent .6. The law is 


$$V(a) = \alpha \left[ \frac{E(a)}{E_0} \right]^{.6},$$

where $E(a)$ is the physical energy level of the vibration, $E_0$ a reference
energy level, and $V(a)$ the psychological sensation. If the decibel scale
$U_{dB}(a)$ is defined by

$$U_{dB}(a) = 10 \log_{10} \left[ \frac{E(a)}{E_0} \right],$$


show that a 5-dB increase leads to a doubling of sensation. 


4.4 A Measurement Axiomatization for Magnitude Estimation 
and Cross-Modality Matching 


4.4.1 Consistency Conditions 


In this section, we discuss axioms under which magnitude estimation 
leads to a ratio scale. Such axioms would put Stevens’ argument that 
magnitude estimation leads to such a scale on a firm measurement-theore- 
tic foundation—provided the axioms are satisfied. A similar approach 
might help settle a variety of disputes about the scale type resulting from 
empirical procedures. We follow Krantz [1972] and Krantz et al. [1971,
Section 4.6]. Suppose $A_1, A_2, \ldots, A_n$ are different physical continua, one
for sounds, one for lights, etc. Suppose we are performing a magnitude
estimation on the $i$th continuum. We fix $x_i$ in $A_i$ and assign to $x_i$ the
psychological magnitude $p$. (We shall confuse $x_i$ with its physical magnitude
and assume that all magnitudes are positive numbers.) Next, we ask
the subject to assign to each $y_i$ in $A_i$ a magnitude $q$, with $q$ depending on $x_i$
and $p$. That is,

$$N_i(y_i \mid x_i, p) = q,$$

where $N_i$ is the magnitude estimate for $y_i$ given that $x_i$ is assigned a
magnitude $p$. In particular,

$$N_i(x_i \mid x_i, p) = p.$$

The psychophysical law gives $q/p$ as a function of $y_i/x_i$. The power law
states that

$$N_i(y_i \mid x_i, p) = \alpha_i p (y_i/x_i)^{\beta_i},$$

where $\alpha_i$ and $\beta_i$ are constants, $\alpha_i$ positive.

In the variant of magnitude estimation called ratio estimation or pair
estimation, a pair of stimuli $x_i$ and $y_i$ from $A_i$ are presented, and the subject
is asked to provide a numerical estimate of the “sensation ratio” of $x_i$ to $y_i$.
We denote this estimate

$$P_i(x_i, y_i).$$

In scaling, pair estimates and magnitude estimates are often assumed to
satisfy the following magnitude-pair consistency condition: for all $z_i$, $p$,

$$P_i(x_i, y_i) = \frac{N_i(x_i \mid z_i, p)}{N_i(y_i \mid z_i, p)}. \tag{4.32}$$

Moreover, it is often assumed that pair estimates act like ratios; that is,
they satisfy the following pair consistency condition:

$$P_i(x_i, y_i) \cdot P_i(y_i, z_i) = P_i(x_i, z_i). \tag{4.33}$$

In cross-modality matching, we usually fix $x_i \in A_i$ and $x_j \in A_j$ and say
they match. We then ask the subject to find a stimulus $y_j \in A_j$ that
matches a given stimulus $y_i \in A_i$. We write

$$C_{ji}(y_i \mid x_i, x_j) = y_j.$$


It is often assumed that magnitude estimation and cross-modality matching
are also related, by the following magnitude-cross-modality consistency
condition:

$$\frac{N_i(y_i \mid z_i, p)}{N_i(x_i \mid z_i, p)} = \frac{N_j[C_{ji}(y_i \mid x_i, x_j) \mid z_j, q]}{N_j(x_j \mid z_j, q)}. \tag{4.34}$$

That is, if $y_i$ is matched with $y_j$ in the cross-modality matching, where $x_i$ is
given as matched with $x_j$, then the ratio of the magnitude estimate of $y_i$ to
the magnitude estimate of $x_i$ on the $i$th modality equals the ratio of the
magnitude estimate of $y_j$ to the magnitude estimate of $x_j$ on the $j$th
modality for any reference estimates $p$ for $z_i$ and $q$ for $z_j$. If $C_{ji}(y_i \mid x_i, x_j) = y_j$
and if $x_i = z_i$, then Eq. (4.34) and Eq. (4.32) with $j$ replacing $i$ and $y$ and
$x$ reversed yield

$$\frac{N_i(y_i \mid x_i, p)}{p} = P_j(y_j, x_j), \tag{4.35}$$

using $N_i(x_i \mid x_i, p) = p$. Equation (4.35) is a second magnitude-pair
consistency condition.
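
As a rough numerical illustration of conditions (4.32) through (4.35), the sketch below simulates a subject whose magnitude estimates follow the power law exactly. Everything specific in it is an assumption of ours: the exponents in `BETA` are hypothetical, $\alpha_i = 1$ is assumed so that $N_i(x_i \mid x_i, p) = p$, and the matching rule in `C` assumes that stimuli match when their sensation ratios to the reference pair are equal.

```python
import math

# Hypothetical exponents for two continua (not taken from the text).
# alpha_i = 1 is assumed so that N_i(x_i | x_i, p) = p holds exactly.
BETA = {1: 0.6, 2: 1.7}

def N(i, y, x, p):
    """Magnitude estimate N_i(y | x, p) under the power law with alpha_i = 1."""
    return p * (y / x) ** BETA[i]

def P(i, x, y, z=1.0, p=10.0):
    """Pair estimate P_i(x, y), defined from magnitude estimates as in Eq. (4.32)."""
    return N(i, x, z, p) / N(i, y, z, p)

def C(j, i, y_i, x_i, x_j):
    """Stimulus y_j on continuum j matching y_i, given that x_i matches x_j (assumed rule)."""
    return x_j * (y_i / x_i) ** (BETA[i] / BETA[j])

x1, y1, z1 = 2.0, 5.0, 3.0
x2 = 4.0
y2 = C(2, 1, y1, x1, x2)

# (4.32): the pair estimate does not depend on the reference z or on p.
check_432 = math.isclose(P(1, x1, y1, z=1.0, p=10.0), P(1, x1, y1, z=7.0, p=3.0))
# (4.33): pair estimates multiply like ratios.
check_433 = math.isclose(P(1, x1, y1) * P(1, y1, z1), P(1, x1, z1))
# (4.34): estimate ratios agree across the two matched modalities.
check_434 = math.isclose(N(1, y1, z1, 10.0) / N(1, x1, z1, 10.0),
                         N(2, y2, 9.0, 6.0) / N(2, x2, 9.0, 6.0))
# (4.35): N_i(y_i | x_i, p) / p equals P_j(y_j, x_j).
check_435 = math.isclose(N(1, y1, x1, 10.0) / 10.0, P(2, y2, x2))
print(check_432, check_433, check_434, check_435)
```

The sketch only shows that an idealized power-law subject is consistent in all four senses; whether real subjects are is an empirical question.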


4.4.2 Cross-Modality Ordering 


Let us introduce a binary relation $\succsim$, the cross-modality ordering
relation, as follows. Suppose $x_i$ and $y_i$ are in $A_i$ and $u_j$ and $v_j$ are in $A_j$. Then

$$(x_i, y_i) \succsim (u_j, v_j)$$

if and only if the sensation ratio of $x_i$ to $y_i$ is judged greater than or equal
to the sensation ratio of $u_j$ to $v_j$. Technically, $\succsim$ is a relation on

$$(A_1 \times A_1) \cup (A_2 \times A_2) \cup \cdots \cup (A_n \times A_n).$$

We would hope that $\succsim$ is related to the pair estimates as follows:

$$(x_i, y_i) \succsim (u_j, v_j) \Leftrightarrow P_i(x_i, y_i) \geq P_j(u_j, v_j). \tag{4.36}$$

The cross-modality ordering relation $\succsim$ is closely related to the matching
relation of cross-modality matching. Indeed, suppose $\sim$ is defined from
$\succsim$ by

$$(x_i, y_i) \sim (u_j, v_j) \Leftrightarrow (x_i, y_i) \succsim (u_j, v_j) \ \&\ (u_j, v_j) \succsim (x_i, y_i).$$

Then it is reasonable to assume that

$$C_{ji}(y_i \mid x_i, x_j) = y_j \Leftrightarrow (y_i, x_i) \sim (y_j, x_j). \tag{4.37}$$

This assumption says that if $x_i$ is matched by $x_j$ and $y_i$ by $y_j$, then the
corresponding sensation ratios are judged equal. The cross-modality ordering
relation $\succsim$ is closely related to the quaternary relation $D$ of algebraic
difference measurement (Section 3.3). Indeed, by modifying the axioms for
algebraic difference measurement, Krantz et al. [1971, p. 165] present
axioms sufficient to prove the following representation theorem (see Exers. 4
through 9 below for the axioms):


REPRESENTATION THEOREM FOR CROSS-MODALITY ORDERING. There exist
functions $\phi_i: A_i \to \mathrm{Re}^+$ so that for all $i$, $j$, and for all $x_i, y_i \in A_i$ and
$u_j, v_j \in A_j$,

$$(x_i, y_i) \succsim (u_j, v_j) \Leftrightarrow \phi_i(x_i)/\phi_i(y_i) \geq \phi_j(u_j)/\phi_j(v_j). \tag{4.38}$$

If $\phi_i'$ are any other such functions, there are positive numbers $\alpha_1, \ldots, \alpha_n$ and
$\beta$ so that

$$\phi_i' = \alpha_i \phi_i^{\beta},$$

all $i$.


The functions $\phi_i$ define psychological scales. Equation (4.38) says that
one sensation ratio is judged greater than or equal to a second if and only
if the ratio of the corresponding psychological magnitudes for the first pair
is greater than or equal to the ratio for this second pair. The representation
(4.38) is very close to that for algebraic difference measurement. The
critical axiom needed for the representation theorem is the monotonicity
axiom:

$$[(x_i, y_i) \succsim (u_j, v_j) \ \&\ (y_i, z_i) \succsim (v_j, w_j)] \Rightarrow (x_i, z_i) \succsim (u_j, w_j). \tag{4.39}$$

This is closely related to the monotonicity axiom (Axiom D3) for algebraic
difference structures (Section 3.3).
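
To see why the representation (4.38) forces monotonicity (4.39), one can enumerate a small example. The sketch below is only an illustration under assumed scales: the two functions in `PHI` and the integer stimuli are hypothetical, and exact rational arithmetic is used so that comparisons are not affected by rounding.

```python
from fractions import Fraction
from itertools import product

# Hypothetical psychological scales phi_1, phi_2 (exact rational arithmetic).
PHI = {
    1: lambda x: Fraction(x),
    2: lambda x: 2 * Fraction(x) ** 2,
}

def at_least(i, x, y, j, u, v):
    """(x_i, y_i) >= (u_j, v_j) in the cross-modality ordering, via Eq. (4.38)."""
    return PHI[i](x) / PHI[i](y) >= PHI[j](u) / PHI[j](v)

stimuli = [1, 2, 3, 5]
violations = 0
for x, y, z in product(stimuli, repeat=3):
    for u, v, w in product(stimuli, repeat=3):
        premises = at_least(1, x, y, 2, u, v) and at_least(1, y, z, 2, v, w)
        # Monotonicity (4.39): the premises must imply (x, z) >= (u, w).
        if premises and not at_least(1, x, z, 2, u, w):
            violations += 1
print(violations)
```

No violation can occur, because multiplying the two premise inequalities gives the conclusion directly; the enumeration merely confirms this on the sample.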


4.4.3 Magnitude Estimation as a Ratio Scale


Suppose that in addition to the Krantz et al. axioms sufficient for the
representation (4.38), we assume that the functions $N_i$, $P_i$, and $C_{ji}$ satisfy
the following conditions:

(a) Pair consistency on the first continuum [Eq. (4.33) for $i = 1$]:

$$P_1(x_1, y_1) P_1(y_1, z_1) = P_1(x_1, z_1).$$

(b) $$(x_i, y_i) \succsim (u_j, v_j) \Leftrightarrow P_i(x_i, y_i) \geq P_j(u_j, v_j). \tag{4.36}$$

(c) $$C_{ji}(y_i \mid x_i, x_j) = y_j \Leftrightarrow (y_i, x_i) \sim (y_j, x_j). \tag{4.37}$$

(d) Equation (4.35) for $j = 1$ whenever $(y_i, x_i) \sim (y_1, x_1)$:

$$(y_i, x_i) \sim (y_1, x_1) \Rightarrow \frac{N_i(y_i \mid x_i, p)}{p} = P_1(y_1, x_1).$$

Then Krantz [1972] proves the following representation theorem: 

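REPRESENTATION THEOREM FOR MAGNITUDE ESTIMATION. There is a
power function $\phi: \mathrm{Re}^+ \to \mathrm{Re}^+$, so that if $\phi_i$ are the functions satisfying Eq.
(4.38), then

$$N_i(y_i \mid x_i, p) = q \Leftrightarrow \phi_i(y_i)/\phi_i(x_i) = \phi(q)/\phi(p), \tag{4.40}$$

$$P_i(x_i, y_i) = r \Leftrightarrow \phi_i(x_i)/\phi_i(y_i) = \phi(r), \tag{4.41}$$

and

$$C_{ji}(y_i \mid x_i, x_j) = y_j \Leftrightarrow \phi_i(y_i)/\phi_j(y_j) = \phi_i(x_i)/\phi_j(x_j). \tag{4.42}$$

Moreover, if $\phi_i'$ and $\phi'$ also satisfy Eqs. (4.38) and (4.40) through (4.42), then
there are positive numbers $\alpha_1, \ldots, \alpha_n$ and $\beta$ so that

$$\phi_i' = \alpha_i \phi_i^{\beta} \quad \text{and} \quad \phi' = \phi^{\beta}. \tag{4.43}$$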


In this theorem we can think of the $\phi_i$ and $\phi$ as psychological scales. Then
Eq. (4.40) says that our magnitude estimate of $y_i$ is $q$ if and only if the ratio
of the psychological magnitudes of $y_i$ and $x_i$ is the same as the ratio of the
psychological magnitudes of the numbers $q$ and $p$.

We shall show that a function $N_i$ satisfying Eq. (4.40) defines a ratio
scale in the wide sense. Thus, the magnitude estimates lead to a ratio scale.
To proceed, we first rewrite Eq. (4.40).

Let us fix $x_i$ and $p$, and denote by $N_i(y_i)$ the number $N_i(y_i \mid x_i, p)$. It
follows from Eq. (4.40) that

$$\frac{\phi_i(y_i)}{\phi_i(x_i)} = \frac{\phi[N_i(y_i)]}{\phi[N_i(x_i)]}. \tag{4.44}$$


We shall show that a function $N_i$ satisfying Eq. (4.44) defines a ratio scale
in the wide sense. Since $\phi$ is a power function, every similarity transformation
$N_i'(y_i) = \lambda N_i(y_i)$ still satisfies (4.44). Conversely, suppose $N_i$ satisfies
(4.44), and we allow admissible transformations of $\phi$ and $\phi_i$. Let us see
what the corresponding transformation $N_i'$ of $N_i$ must be. We have

$$\frac{\phi_i'(y_i)}{\phi_i'(x_i)} = \frac{\phi'[N_i'(y_i)]}{\phi'[N_i'(x_i)]}.$$

Using the uniqueness results of Eq. (4.43), we have

$$\frac{\alpha_i [\phi_i(y_i)]^{\beta}}{\alpha_i [\phi_i(x_i)]^{\beta}} = \frac{\{\phi[N_i'(y_i)]\}^{\beta}}{\{\phi[N_i'(x_i)]\}^{\beta}}.$$

Since $\alpha_i$ and $\beta$ are positive, we have

$$\frac{\phi_i(y_i)}{\phi_i(x_i)} = \frac{\phi[N_i'(y_i)]}{\phi[N_i'(x_i)]},$$

and so by (4.44),

$$\frac{\phi[N_i(y_i)]}{\phi[N_i(x_i)]} = \frac{\phi[N_i'(y_i)]}{\phi[N_i'(x_i)]}. \tag{4.45}$$

Let $\lambda$ be $N_i'(x_i)/N_i(x_i)$. Since magnitude estimates are assumed to be
positive numbers, $\lambda$ is positive. We shall see that

$$N_i'(y_i) = \lambda N_i(y_i), \quad \text{all } y_i. \tag{4.46}$$


Thus, in the wide sense, all admissible transformations of $N_i$ are similarity
transformations. It follows that $N_i$ is a ratio scale in the wide sense. To
demonstrate (4.46), note that since $\phi$ is a power function, we have from
(4.45)

$$\frac{\mu [N_i(y_i)]^{\nu}}{\mu [N_i(x_i)]^{\nu}} = \frac{\mu [N_i'(y_i)]^{\nu}}{\mu [N_i'(x_i)]^{\nu}},$$

$\mu > 0$. Now we may assume that $\nu \neq 0$, for otherwise $\phi(x)$ is a constant,
and $N_i(y_i \mid x_i, p) = q$ implies $N_i(y_i \mid x_i, p) = q'$ for any other $q'$, which is
nonsense. It follows from $\mu > 0$, $\nu \neq 0$, that

$$\frac{N_i(y_i)}{N_i(x_i)} = \frac{N_i'(y_i)}{N_i'(x_i)}.$$

Thus,

$$N_i'(y_i) = \frac{N_i'(x_i)}{N_i(x_i)} N_i(y_i) = \lambda N_i(y_i),$$

as desired.
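
The role of similarity transformations in this argument can be illustrated numerically. The following sketch is ours, not the text's: the constants `MU`, `NU`, `BETA_I`, the reference values, and the sample stimuli are all hypothetical. It builds estimates satisfying Eq. (4.40) and then checks Eq. (4.44) for a rescaled and for a shifted version of them.

```python
import math

MU, NU = 3.0, 0.7   # hypothetical constants of the power function phi(q) = MU * q**NU
BETA_I = 1.4        # hypothetical exponent of the psychological scale phi_i

def phi(q):
    return MU * q ** NU

def phi_i(y):
    return y ** BETA_I

X_I, P_REF = 2.0, 10.0  # reference stimulus and the number assigned to it

def N(y):
    """Magnitude estimate q solving Eq. (4.40): phi_i(y)/phi_i(x_i) = phi(q)/phi(p)."""
    return P_REF * (phi_i(y) / phi_i(X_I)) ** (1.0 / NU)

def satisfies_444(estimate, ys):
    """Check Eq. (4.44) for a candidate estimate function on the stimuli ys."""
    return all(
        math.isclose(phi_i(y) / phi_i(X_I), phi(estimate(y)) / phi(estimate(X_I)))
        for y in ys
    )

stimuli = [0.5, 1.0, 2.0, 8.0]
similarity = lambda y: 2.5 * N(y)   # N' = lambda * N with lambda = 2.5
shifted = lambda y: N(y) + 1.0      # not a similarity transformation

results = (satisfies_444(N, stimuli),
           satisfies_444(similarity, stimuli),
           satisfies_444(shifted, stimuli))
print(results)
```

In this example the rescaled estimates still satisfy (4.44) while the shifted ones do not, in line with the conclusion that the admissible transformations are exactly the similarity transformations.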

Remark: Quite different formal theories of magnitude estimation can be
found in Green and Luce [1974], Luce and Green [1974 a,b], and Marley 
[1972]. 


Exercises 


1. Suppose loudness is modality $i = 1$ and the magnitude estimates are
as in Fig. 4.2.

(a) Let $\log y_1 = 60$ and $\log z_1 = 80$. Suppose $N_1(y_1 \mid x_1, p) = 60$ and
$N_1(z_1 \mid x_1, p) = 80$. If magnitude-pair consistency holds, compute
$P_1(y_1, z_1)$.

(b) Repeat the computation of $P_1(y_1, z_1)$ for the following cases:
(i) $N_1(y_1 \mid x_1, p) \approx 40$, $N_1(z_1 \mid x_1, p) \approx 50$.
(ii) $N_1(y_1 \mid x_1, p) \approx 70$, $N_1(z_1 \mid x_1, p) \approx 90$.

2. Suppose modality $i = 1$ is cold and modality $i = 2$ is force of
handgrip, and suppose cross-modality matchings are given as in Fig. 4.4
for fixed $x_1$ and $x_2$.

(a) Show that if $\log y_1 \approx$ 10°, $C_{21}(y_1 \mid x_1, x_2)$ is that handgrip force $y_2$
whose logarithm is $\approx 15$.

(b) Compute $C_{23}(y_3 \mid x_3, x_2)$ if modality $i = 3$ is warmth and if $\log y_3 \approx$ 10°.

3. Suppose that for all $i$, $N_i(y_i \mid z_i, p) = \alpha_i p (y_i/z_i)^{\beta_i}$, where $\beta_i$ is the
exponent given in Table 4.2. Assume the conditions (4.32) and (4.36) hold.
Given the following $i$, $j$, $x_i$, $y_i$, and $u_j$, determine $v_j$ so that

$$(x_i, y_i) \sim (u_j, v_j):$$

(a) $i$ = brightness under 5° target.
$j$ = temperature (cold on arm).
$x_i = 100$.
$y_i = 200$.
$u_j = 100$.

(b) $i$ = taste (salt).
$j$ = vocal effort.
$x_i = 100$.
$y_i = 200$.
$u_j = 100$.


4. Show that if Eq. (4.38) holds, then the cross-modality ordering $\succsim$
defines a weak order.


5. Show that if Eq. (4.38) holds, then $\succsim$ satisfies

$$(x_i, y_i) \succsim (u_j, v_j) \Rightarrow (v_j, u_j) \succsim (y_i, x_i).$$

6. Show that the monotonicity condition of Eq. (4.39) follows from the
representation of Eq. (4.38).


7. Show that the following condition is not a necessary condition for the
representation (4.38): Given $x_i, y_i \in A_i$, there are $x_1, y_1 \in A_1$ so that

$$(x_i, y_i) \sim (x_1, y_1).$$

8. Show that even if the conditions of Exers. 4 through 7 hold, the
following solvability condition on the cross-modality ordering is not a
necessary condition for the representation (4.38): If

$$(x_1, y_1) \succsim (u_i, v_i) \succsim (x_1, x_1),$$

then there are $z_1$ and $z_1'$ so that

$$(x_1, z_1) \sim (u_i, v_i) \sim (z_1', y_1).$$


9. Suppose the conditions of Exers. 4 through 8 hold. The elements
$x_1^{(1)}, x_1^{(2)}, \ldots, x_1^{(i)}, \ldots$ from $A_1$ form a strictly bounded standard sequence
if

$$(x_1^{(i+1)}, x_1^{(i)}) \sim (x_1^{(2)}, x_1^{(1)}), \quad \text{for all } i \text{ in the sequence},$$

$$\text{not } (x_1^{(2)}, x_1^{(1)}) \sim (x_1^{(1)}, x_1^{(1)}),$$

and there exist $y_1$ and $y_1'$ in $A_1$ so that

$$(y_1, y_1') \succ (x_1^{(i)}, x_1^{(1)}) \succ (y_1', y_1), \quad \text{for all } i \text{ in the sequence},$$

where

$$(z_1, z_1') \succ (w_1, w_1') \Leftrightarrow \sim[(w_1, w_1') \succsim (z_1, z_1')].$$

Show from the representation (4.38) that every strictly bounded standard
sequence is finite. [Note: Krantz et al. [1971, p. 165] show that the
conditions studied in Exers. 4 through 9 are sufficient to prove the
representation (4.38).]

CHAPTER 5
Product Structures 


5.1 Obtaining a Product Structure 


In this chapter we study a variant on the measurement problems we 
have considered so far. Specifically, we consider measurement when the 
underlying set can be expressed as a Cartesian product. In making choices, 
we often consider alternatives with a variety of attributes or dimensions or 
from several points of view. We talk about multidimensional or multiattri- 
buted alternatives. For example, in choosing a job, we might consider 
salary, job security, possibility for advancement, geographical location, 
and so on. In buying a house, we might consider price, location, school 
system, availability of transportation, and the like. In designing a rapid
transit system, we might consider power source, vehicle design, right of
way design, and so on. In such a situation, each alternative $a$ in the set of
alternatives can be thought of as an $n$-tuple $(a_1, a_2, \ldots, a_n)$, where $a_i$ is
some rating of alternative $a$ on the $i$th attribute or dimension. For example,
in the case of a job, $a_1$ might be salary, $a_2$ might be fringe benefits, $a_3$
might be some measure of job security (for example, amount of notice
required), and so on.

Multidimensional alternatives arise in a different way in economics. If 
there are $n$ commodities in consideration, $a_i$ might be a quantity of the $i$th
commodity, and $(a_1, a_2, \ldots, a_n)$ then is a commodity bundle or market
basket. Preferences are expressed among alternative market baskets. (We 
have previously encountered market baskets in our study of the consumer 
price index in Section 2.6.) 

Let $A_i$ be the set of all possible $a_i$. We think of the set of alternatives $A$
as a Cartesian product $A_1 \times A_2 \times \cdots \times A_n$, and say that $A$ has a product
structure. If each $A_i$ is a set of real numbers, we say $A$ has a numerical
product structure.

A product structure that is nonnumerical can arise in a variety of ways. 
For example, $A_i$ can be a set of possibilities for the $i$th attribute, and $a_i$ can
be chosen as an element of $A_i$. Thus, in the rapid transit situation, $A_1$ might
be a set of alternative power sources, $A_2$ a set of vehicle designs, $A_3$ a set of
right of way designs, and so on. To choose an alternative, we pick one
member from each $A_i$, that is, one power source, one vehicle design, one
right of way design, and so on. 
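
A nonnumerical product structure of this kind can be sketched directly as a Cartesian product. The option lists below are hypothetical; the text names the attributes (power source, vehicle design, right of way design) but not their members.

```python
from itertools import product

# Hypothetical option lists for the three attributes named in the text.
power_sources = ["electric", "diesel"]
vehicle_designs = ["single car", "articulated"]
right_of_way_designs = ["elevated", "underground", "surface"]

# The set of alternatives A = A_1 x A_2 x A_3: one member from each A_i.
A = list(product(power_sources, vehicle_designs, right_of_way_designs))

print(len(A))
print(A[0])
```

Each alternative is a triple, and the number of alternatives is the product of the sizes of the component sets.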

Product structures arise in numerous applications where we are trying to 
explain a response or a dependent variable on the basis of a number of 
factors or independent variables. In studying response strength, psycholo- 
gists often consider two factors, drive and incentive. A given situation is 
defined by some measure d of drive and some measure k of incentive, and 
the set of situations corresponds to the set D x K, where D is a set of 
different (levels of) drives and K a set of different (levels of) incentives. 
Sometimes habit strength is also considered. In that case, if H is a set of 
strengths of habits, one considers situations corresponding to ordered 
triples in the set D X K X H. 

In studying binaural loudness in auditory perception, psychologists 
present sounds of different intensity to each ear. The set A of alternative 
stimuli is L X R, where L is the set of sounds (sound intensities) presented 
to the left ear and R the set presented to the right ear. 

In a mental testing situation, it is important to study the interaction 
between a subject and an item on a test. If S is the set of subjects and T is 
the set of test items, then S X T is the set that is often studied. 

In studying discomfort under different weather conditions, the factors
temperature $t$ and humidity $h$ play a principal role. Relative discomfort is
considered under various combinations of temperature and humidity. If T 
is the set of temperatures of interest and H the set of humidities, then 
T X H is the set of alternative weather conditions that can be compared 
as to discomfort. 

Often a product structure will be presented ahead of time. However, 
especially if we are out to calculate a utility function, the first step is often 
to structure the set of alternatives A. In this section, we make some 
remarks on how to find a product structure. In the next section, we discuss, 
in the context of preferences, how to reduce one product structure to 
another with fewer dimensions, an important reduction in practice. In 
Section 5.3, we consider sets of alternatives with numerical product struc- 
tures, in particular sets of alternative commodity bundles, and discuss the 
calculation of ordinal utility functions over sets of alternatives with such 
product structures. In Section 5.4, we seek functions (utility functions and 
other order-preserving functions) that can be calculated by reducing the 
computation to each dimension separately, and then adding. In Section 
5.5, we continue the discussion of reduction of computation to dimension
by dimension computation, but consider ways other than addition of 
combining results from different dimensions. In Section 5.6 we consider 
quite different measurement problems, where there are two dimensions 
and the set of individuals making judgments is one of the dimensions. 
Throughout the chapter, we shall keep in mind the applicability of the 
results to the drive and incentive problem, the binaural loudness problem, 
the mental testing problem, and the temperature—humidity problem. A 
variety of other applications of product structures, and in particular of 
utility functions over product structures, is described in Keeney and Raiffa 
[1976]. See also Farquhar [1977] and Cochrane and Zeleny [1973] for 
surveys of multidimensional utility theory. Other applications of utility 
functions over product structures include decisionmaking about educa- 
tional priorities, choice of air pollution abatement strategies, development 
of water quality indices, choice of medical treatments, and siting for major 
new facilities such as airports. See Section 7.3.1 for references. 

The question of how to introduce a product structure on A is a very 
important one, and one for which there is no precise theory. There are, at 
best, rules of thumb. Often, we shall simply be given relevant dimensions, 
and there is nothing to do. However, the problem we consider briefly is 
this: Given an unstructured set of alternatives A, how do we give it a 
product structure? The first step is to define the set of aspects or dimensions.
Perhaps the most natural procedure for doing this in many decisionmaking
contexts is that described by Manheim and Hall [1968],
Miller [1969], and Raiffa [1969]: build up the structure hierarchically. 
Namely, start by listing an inclusive set of first-level attributes or objec- 
tives or facets. Then subdivide each of these into an inclusive list of 
second-level objectives, more precise than these. And so on. For an 
extensive discussion of the problem of introducing a product structure, see 
Keeney and Raiffa [1976, Chapter 2]. 

To illustrate this hierarchical procedure, we present the Manheim—Hall 
hierarchical structuring of attributes useful in comparing alternative trans- 
portation systems in the Northeast Corridor of the United States. The 
“super-goal” Manheim and Hall begin with is “The Good Life.” They 
subdivide this goal into four dimensions: convenience, safety, aesthetics, 
and economic considerations. Each of these dimensions is further subdi- 
vided. And so on. The subdivisions are shown in a “tree diagram” in Fig. 
5.1. The boxes at the bottom end of each branch represent the final 
collection of dimensions; there are twenty in all. 
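
The hierarchical procedure amounts to building a tree and reading off its leaves. The sketch below is only illustrative: the four first-level dimensions come from the text, but the tree is deliberately partial, and the placement of travel time, probability of delay, and noise under particular parents is our assumption rather than a transcription of Fig. 5.1.

```python
def final_dimensions(node):
    """Return the leaves of a nested attribute hierarchy, in order."""
    result = []
    for name, children in node.items():
        if children:
            # An attribute that has been subdivided contributes its sub-attributes.
            result.extend(final_dimensions(children))
        else:
            # An attribute with no subdivisions is a final dimension.
            result.append(name)
    return result

# Partial, illustrative hierarchy under the super-goal "The Good Life".
hierarchy = {
    "convenience": {"travel time": {}, "probability of delay": {}},
    "safety": {},
    "aesthetics": {"noise": {}},
    "economic considerations": {},
}

dimensions = final_dimensions(hierarchy)
print(dimensions)
```

The list of leaves plays the role of the final collection of dimensions, which then defines the product structure.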

Once a final collection of dimensions or attributes has been obtained, 
these define a product structure. The next step is often to obtain some 
numerical assignment $a_i = f_i(a)$ for each alternative $a \in A$ on each dimension
$i$. That is, we want to scale each element on each of the final list of
dimensions, such as travel time, probability of delay, and noise. These 


[Figure 5.1 tree diagram. Super-goal: “The Good Life.” First-level dimensions: (a) convenience, (b) safety, (c) aesthetics, (d) economic considerations. Labels recoverable on lower levels include travel times, probability of delay, decrease fatalities, decrease injuries, property damage, user, non-user, dollar cost, construction, operating and maintenance, terminals, guideway, socioeconomic impacts, regional growth patterns, and socioeconomic classes affected by system. The final dimensions are numbered (1) through (20).]

Figure 5.1. Manheim—Hall hierarchical structuring of attributes for comparison of alternative 
Northeast Corridor transportation systems. Adapted from Raiffa [1969, pp. 19, 20] with 
permission of the RAND Corporation. 
scales are not necessarily utilities; they are simply numbers that translate
the product structure into a numerical product structure and make it easier 
for us to define our preferences among them. Some of the scaling will be 
relatively easy, for example, measurement of travel time. But a good deal 
of progress in measurement will have to be made for this scaling to be 
carried out in general. In the case of noise, for example, there is a whole 
proliferation of possible noise measures in use (cf. Kryter [1970] and our 
discussion of loudness scaling in Chapter 4). It is not clear which of these 
is the most satisfactory. The comfort dimension will be even harder to 
scale, and if the hierarchical structuring procedure is to lead to a numerical 
product structure, this will have to be scaled as well. Considering these 
difficulties, perhaps it is best to simply create a product structure, but not 
translate it into a numerical one. Measurement—of utility for example— 
can still proceed, as Section 5.4 illustrates. 


Exercises 


1. Suppose $f_i(a)$ is a rating of alternative $a$ on the $i$th dimension.

(a) Show that even if each $f_i$ defines a ratio scale, the statement

$$\frac{1}{n} \sum_{i=1}^{n} f_i(a) > \frac{1}{n} \sum_{i=1}^{n} f_i(b)$$

is meaningless. (Cf. the discussion in Section 2.6.)

(b) Consider the meaningfulness of the statement

$$\frac{1}{n} \sum_{x \in X} f_i(x) > \frac{1}{m} \sum_{y \in Y} f_i(y),$$

where $n = |X|$ and $m = |Y|$.

2. An interesting question in measurement theory is the following: 
Suppose someone expresses or exercises preferences (on an unstructured 
set of alternatives). Can we judge if he is acting as if he had a product 
structure? One way to formalize this is the following: Suppose $R$ is
preference on the set $A$. Are there real-valued functions $f_1, f_2, \ldots, f_n$ on $A$,
some $n$, so that for all $a$, $b$ in $A$,

$$aRb \Leftrightarrow (\forall i)[f_i(a) > f_i(b)]. \tag{5.1}$$


That is, is the person acting as if he measures “preference” on each of a set 
of dimensions, and expresses preference for an alternative if and only if it 
receives a higher score on each dimension? The theory of dimension of 
strict partial orders (Section 1.6, Exers. 12 through 18) is relevant to this 
problem. Show the following: 

(a) If there are functions $f_1, f_2, \ldots, f_n$ satisfying Eq. (5.1), then
$(A, R)$ is the intersection of $n$ strict weak orders $(A, R_i)$ and $(A, R)$ is a
strict partial order.

(b) If $A$ is finite and $(A, R)$ is the intersection of a set of $n$ strict weak
orders on $A$, then $(A, R)$ is the intersection of a set of $n$ strict simple orders
on $A$, provided $n > 1$. Hence, if $A$ is finite and there are functions
$f_1, f_2, \ldots, f_n$ satisfying (5.1), with $n > 1$, then the dimension of the strict
partial order $(A, R)$ is at most $n$.

(c) Conversely, if $A$ is finite and $(A, R)$ is a strict partial order of
dimension at most $n$, $n > 1$, there are functions $f_1, f_2, \ldots, f_n$ satisfying
(5.1).

(d) Hence, for every strict partial order on a finite set, there are
functions $f_1, f_2, \ldots, f_n$ satisfying (5.1), for some $n$. (That is, every person’s
preferences can be given a product structure, provided they satisfy the
strict partial order assumptions.) The smallest $n$ for which there are such
functions is the dimension of the strict partial order $(A, R)$, except that
some two-dimensional strict partial orders can be represented in the form
(5.1) with $n = 1$.

For a further measurement-theoretic discussion of dimension of strict 
partial orders, see Baker, Fishburn, and Roberts [1971] and Roberts [1972]. 

Note: Variants of the representation (5.1) are of interest in measurement
theory. We might ask for functions $f_i$ such that

$$aRb \Leftrightarrow [f_1(a) > f_1(b)] \text{ or } [f_1(a) = f_1(b) \ \&\ f_2(a) > f_2(b)] \text{ or } \ldots$$
$$\text{or } [f_1(a) = f_1(b) \ \&\ \ldots \ \&\ f_{n-1}(a) = f_{n-1}(b) \ \&\ f_n(a) > f_n(b)].$$

This is a lexicographic representation. In the footnote on page 269, we
shall mention a representation that has as a special case

$$aRb \Leftrightarrow (\forall i)[\,|f_i(a) - f_i(b)| < \delta\,].$$


These are all representations into a product structure rather than into the 
reals, with R corresponding to some relation on the product structure. Not 
much progress has been made in studying any of these representations. 
However, they are potentially very useful. 
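
The representation (5.1) and its lexicographic variant are easy to state as predicates on score vectors. In the sketch below the function names and the sample scores are ours; each alternative is represented by the tuple $(f_1(a), \ldots, f_n(a))$.

```python
def dominates(fa, fb):
    """Eq. (5.1): a R b iff a scores strictly higher than b on every dimension."""
    return all(x > y for x, y in zip(fa, fb))

def lexicographic(fa, fb):
    """a R b iff a scores higher on the first dimension where the scores differ."""
    for x, y in zip(fa, fb):
        if x != y:
            return x > y
    return False  # identical score vectors: neither is preferred

# Hypothetical score vectors (f_1, f_2) for three alternatives.
a, b, c = (3, 2), (1, 1), (1, 2)

print(dominates(a, b), dominates(a, c))
print(lexicographic(a, c), lexicographic(c, b), lexicographic(b, b))
```

Here `a` dominates `b` but not `c`, since `a` and `c` tie on the second dimension, whereas the lexicographic rule ranks `a` above `c` on the first dimension alone.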


5.2 Calculating Ordinal Utility Functions by Reducing the 
Dimensionality 


In this chapter, we shall study a binary relation R on a set of alternatives 
A which has a product structure. Often R will be interpreted as preference. 
However, sometimes it will have the interpretation “responds more 
strongly than,” “sounds louder than,” “is scored higher than,” etc. We 
shall use the notation 


$$aSb \Leftrightarrow \sim bRa \tag{5.2}$$

and

$$aEb \Leftrightarrow \sim aRb \ \&\ \sim bRa. \tag{5.3}$$

If $R$ is strict preference, then $S$ is weak preference and $E$ is indifference.*
In this section, we mention some tools for calculating a function $f$ on $A$
which preserves the relation $R$, that is, a function $f$ on $A$ so that for all
$(a_1, a_2, \ldots, a_n)$ and $(b_1, b_2, \ldots, b_n)$ in $A$,

$$(a_1, a_2, \ldots, a_n) R (b_1, b_2, \ldots, b_n) \Leftrightarrow f(a_1, a_2, \ldots, a_n) > f(b_1, b_2, \ldots, b_n). \tag{5.4}$$

In case $R$ is preference, $f$ is an ordinal utility function. In Section 5.4, we
shall ask more of our utility function, namely that it be additive, that is,
that there be real-valued functions $f_1$ on $A_1$, $f_2$ on $A_2$, ..., up to $f_n$ on $A_n$
so that

$$f(a_1, a_2, \ldots, a_n) = f_1(a_1) + f_2(a_2) + \cdots + f_n(a_n) \tag{5.5}$$

or

$$(a_1, a_2, \ldots, a_n) R (b_1, b_2, \ldots, b_n) \Leftrightarrow f_1(a_1) + f_2(a_2) + \cdots + f_n(a_n) > f_1(b_1) + f_2(b_2) + \cdots + f_n(b_n). \tag{5.6}$$


If such functions f, exist, then we may calculate utility by calculating it 
separately on each component, and adding. In Section 5.5, we shall again 
ask for an f that can be calculated separately on each component, but we 
shall not assume that it is obtained from the individual component values 
by addition, but rather by other “composition rules.” 
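
An additive representation can be sketched as follows. The two component functions are hypothetical (a logarithmic scale for salary and a linear one for weeks of notice); they are chosen only to show how the component values are computed separately and then added.

```python
import math

# Hypothetical component functions f_1, f_2 for a two-attribute job choice.
component_utilities = [
    lambda salary: math.log(salary),
    lambda weeks_notice: 0.1 * weeks_notice,
]

def additive_utility(a):
    """Eq. (5.5): f(a_1, ..., a_n) = f_1(a_1) + ... + f_n(a_n)."""
    return sum(f_i(a_i) for f_i, a_i in zip(component_utilities, a))

def prefers(a, b):
    """Eq. (5.6): a R b iff the summed component utilities of a exceed those of b."""
    return additive_utility(a) > additive_utility(b)

job_a = (50000, 4)   # (salary, weeks of notice)
job_b = (40000, 4)
job_c = (41000, 0)

print(prefers(job_a, job_b), prefers(job_b, job_c), prefers(job_b, job_b))
```

With these assumed components, the extra notice in `job_b` outweighs the slightly higher salary in `job_c`; a different choice of component functions could reverse that comparison.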

There are times when the number of dimensions used in a decision
problem might number in the thousands.† In order to calculate a utility
function f, it is helpful to try to reduce the number of dimensions.
Sometimes a simple technique works. We illustrate it first in the
two-dimensional case, $A = A_1 \times A_2$. Fix an element $y^*$ in $A_2$. (This could be the
“best” or “worst” possibility in $A_2$, for example 0.) Given $a = (a_1, a_2)$ in A,
find $x = \pi(a)$ in $A_1$ so that $aE(x, y^*)$, where E is indifference and is
defined by Eq. (5.3). Then, assuming that (A, R) is a strict weak order, for
$a = (a_1, a_2)$ and $b = (b_1, b_2)$, we have

$$aRb \iff [\pi(a), y^*]\, R\, [\pi(b), y^*].$$


The alternatives of the form $(x, y^*)$ for fixed $y^*$ are essentially
one-dimensional, and so we have reduced our problem of calculating a utility
function on two-dimensional alternatives to the problem of calculating a
utility function over one-dimensional alternatives. Of course, this
procedure only works if for every a there is a $\pi(a)$.

*Put another way, aRb iff a is “better than” b, aSb iff a is “at least as good as” b, and aEb
iff a and b are “equally good.”
†See Dole et al. [1968] for an example.

A similar reduction technique works for higher dimensions. We present 
an example from medical decisionmaking due to Raiffa [1969]. In consid- 
ering the results of alternative medical treatments, we might consider the 
following dimensions: 


$a_1$ = amount of money spent for treatment, drugs, etc.
$a_2$ = number of days in bed with a high index of discomfort.
$a_3$ = number of days in bed with a medium index of discomfort.
$a_4$ = number of days in bed with a low index of discomfort.
$a_5$ = 1 if complication A occurs, 0 if it does not occur.
$a_6$ = 1 if complication B occurs, 0 if it does not occur.
$a_7$ = 1 if complication C occurs, 0 if it does not occur.


If $a_1$, $a_4$, $a_5$, $a_6$, and $a_7$ are kept fixed, and $a_2$ is changed to 0, let us ask
what value of the third component will compensate for this change, i.e., for
what value $a_3'$ is

$$(a_1, a_2, a_3, a_4, a_5, a_6, a_7)$$

judged indifferent to

$$(a_1, 0, a_3', a_4, a_5, a_6, a_7).$$

That is, we think about how many days at medium level of discomfort we
would trade for a given number of days at high discomfort, all other things
being equal. In the same way, we find $a_3''$ and $a_3'''$ so that

$$(a_1, 0, a_3', a_4, a_5, a_6, a_7)\, E\, (a_1, 0, a_3'', 0, a_5, a_6, a_7)$$

and

$$(a_1, 0, a_3'', 0, a_5, a_6, a_7)\, E\, (0, 0, a_3''', 0, a_5, a_6, a_7).$$


We might think of $a_3'''$ as representing a certain number of days in bed
with medium discomfort which corresponds to the vector $(a_1, a_2, a_3, a_4)$. In
any case, we have reduced from seven dimensions to four dimensions. In 
Section 7.3.1, we shall show how for this example one can reduce to two 
dimensions. For recent work on the use of indifference judgments to 
reduce dimensions and otherwise simplify the assessment of multidimen- 
sional utility functions, see MacCrimmon and Siu [1974], MacCrimmon 
and Wehrung [1978], or Keeney [1971, 1972]. 


Exercise 


(Raiffa [1969]) Suppose $(x_1, x_2)$ represents an amount of cash (say
salary) incoming in two time periods, 1 and 2. Money incoming in period 1
can be reinvested to produce more money. Also, consumption now is
(often considered) sweeter than consumption later. Thus, we might find a
constant $\lambda$ so that $(x_1, x_2)$ is judged indifferent to $(x_1 + \lambda x_2, 0)$. The
number $\lambda$ can be thought of as the (subjective) discount rate. Find an
(explicit) utility function over the set of pairs $(x_1, x_2)$. (Note: The
oversimplification here is that the discount rate $\lambda$ is constant. In general, we will
have

$$(x_1, x_2)\,E\,[x_1 + \lambda(x_1, x_2)\,x_2,\ 0],$$

where $\lambda(x_1, x_2)$ is a variable discount rate. Since we can usually get a
greater return on investment for large amounts invested, $\lambda(x_1, x_2)$ might
decrease for fixed $x_2$ as $x_1$ increases.)


5.3 Ordinal Utility Functions over Commodity Bundles


Suppose as in the previous section that R is a binary relation on a set of 
alternatives A which has a product structure. In this section, we shall 
usually think of R as strict preference. We shall seek conditions on (A, R) 
sufficient for the existence of an ordinal utility function, a real-valued 
function f on A which satisfies 


$$(a_1, a_2, \ldots, a_n) R (b_1, b_2, \ldots, b_n) \iff f(a_1, a_2, \ldots, a_n) > f(b_1, b_2, \ldots, b_n). \qquad (5.4)$$


Of course, the Birkhoff-Milgram Theorem (Theorem 3.4, Corollary 1) 
gives conditions on (A, R) necessary and sufficient for the existence of 
such a function f. But we shall want conditions that are more special to a 
product structure. 

We state a representation theorem for the representation (5.4) for the
case where A has a numerical product structure, that is, each $A_i$ is a set of
real numbers.* For simplicity, we assume that $A = Re^n$; that is, we allow
all numerical values on each dimension. Essentially the same theorem
holds if each component of the product is a real interval rather than all of
Re.

*We follow Luce and Suppes [1965, p. 259], who credit Wold and Jureen [1953] and
Uzawa [1960].

Suppose $a = (a_1, a_2, \ldots, a_n)$ and $b = (b_1, b_2, \ldots, b_n)$ are two elements
of A. We say that $a \ge b$ if $a_i \ge b_i$ for all i. We say that $a > b$ if $a \ge b$ and
$a \ne b$, that is, $a_i \ge b_i$ for all i and $a_i > b_i$ for some i. Also, $a + b$ is the
vector

$$(a_1 + b_1, a_2 + b_2, \ldots, a_n + b_n),$$

and if $\lambda$ is a real number, then $\lambda a$ is the vector

$$(\lambda a_1, \lambda a_2, \ldots, \lambda a_n).$$


The relation (A, R) is said to satisfy the dominance condition if,
whenever $a > b$, then aRb. (We have already encountered a related
condition in Section 1.6.) In the case of preference, not all scaling
procedures lead to product structures satisfying dominance. For example, in the
case of the Northeast Corridor, let us consider travel time $a_i$ associated
with a particular transportation mode. You certainly like small $a_i$ better
than large $a_i$. This can of course be remedied by using 1/(travel time) as
the value $a_i$. But there are more serious problems. If you live very close to
your relatives (that is, if $a_i$ is small), you can visit often and with little
expense. If $a_i$ is in the middle range, say 6 hours or so by car, your
relatives still expect to see you often, but it is now quite an expense to
make lots of trips and it is quite time-consuming, for a visit is worthwhile
only if it lasts several days. If $a_i$ is large, say 18 hours by car, you cannot
visit often, maybe just once or twice a year. But then annually, travel
expenses are not so bad. In short, it is quite conceivable that you will like
small $a_i$ best, large $a_i$ next best, and moderate $a_i$ least. Another example is
due to Miller [1969]. Consider the problem of scratching an itchy portion
of skin. “For a while, continued scratching is preferable to discontinued
scratching; but if the scratching process is continued too long or too
intensively, it is preferable to scratch more lightly or to discontinue the
process altogether.” Thus, for example, if $a_i$ is “time spent scratching,”
then moderate $a_i$ is preferred to small $a_i$, but large $a_i$ is worst of all. As
Keeney and Raiffa [1976] point out, a similar situation arises with the level
of blood sugar in the body. Below a “normal” level, higher blood sugar
levels are preferred. Above a “normal” level, lower levels are preferred.

There are many examples of numerical product structures that do seem 
to give rise to preference relations satisfying dominance. An example is 
where A is a set of alternative commodity bundles (market baskets): there 
are n products and $a_i$ is the quantity of the ith product in your bundle.
Then it seems reasonable to assume that preference satisfies the dominance 
condition over commodity bundles. Having 4 lamps is better than having
2, having 10 lamps is better than having 4, and having 150 lamps is better
than having 10. This represents a fundamental assumption of economics: 
you can never be satiated with a good. Thus, the dominance condition is 
sometimes called nonsatiety or nonsaturation. The lexicographic ordering 
on the plane (Section 3.1.4) is another example of a binary relation 
satisfying dominance. However, even though lexicographic preference is 
also a strict weak order (even strict simple), we know that there is no 
ordinal utility function (Section 3.1.4). Thus, we shall have to add still 
another assumption. 
To state a representation theorem, we define E on A by Eq. (5.3). 


THEOREM 5.1. Suppose $A = Re^n$ and R is a binary (preference) relation
on A. Then there is a (utility) function f on A satisfying

$$aRb \iff f(a) > f(b) \qquad (5.7)$$

whenever (A, R) satisfies the following conditions:

(a) (A, R) is a strict weak order.

(b) (A, R) satisfies the dominance condition.

(c) (Continuity): If aRb and bRc, then there is a real number $\lambda$ such that
$0 \le \lambda \le 1$ and

$$[\lambda a + (1 - \lambda)c]\,Eb.$$


To understand the third condition, let us observe that $\lambda a + (1 - \lambda)c$ is a
point on the straight line joining a and c. In the case of preference, the 
assertion is that the “indifference curve” of all elements to which element b 
is judged indifferent is continuous and hence intersects the straight line 
joining a and c. (See Fig. 5.2.) Theorem 5.1 says that we can assume that 
there is an ordinal utility function so long as preferences satisfy the 
conditions (a), (b), and (c). We have already discussed the dominance 
condition. The continuity condition is in fact similar to the condition in 
the Birkhoff-Milgram Theorem which asserts the existence of a countable 
order-dense subset. However, the continuity condition has a simpler eco- 
nomic interpretation. 

To prove Theorem 5.1, let us say that a is a diagonal vector if $a_i = a_j$, all
i, j. We show that for all a there is a unique diagonal $a^*$ such that $aEa^*$.
Assuming that this is true, we shall define f as follows. If a is a diagonal
vector, let $f(a) = a_1$. If a is not a diagonal vector, let $f(a) = f(a^*)$. We
show that

$$f(a^*) > f(b^*) \iff a^* > b^*, \qquad (5.8)$$

$$a^* R b^* \iff a^* > b^*, \qquad (5.9)$$

and

$$a^* R b^* \iff aRb. \qquad (5.10)$$

[Figure 5.2. The continuity axiom. Labeled curve: indifference curve.]


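These three conditions imply (5.7). Now,

$$f(a^*) > f(b^*) \iff (a^*)_1 > (b^*)_1 \iff a^* > b^*,$$

which proves (5.8). Using dominance we find that

$$a^* > b^* \Rightarrow a^* R b^*.$$

Next,

$$\begin{aligned}
\sim (a^* > b^*) &\Rightarrow [\,b^* > a^* \text{ or } b^* = a^*\,] && \text{(since } a^*, b^* \text{ are diagonal)} \\
&\Rightarrow [\,b^* R a^* \text{ or } b^* = a^*\,] && \text{(by dominance)} \\
&\Rightarrow [\,b^* R a^* \text{ or } b^* E a^*\,] && \text{(R is irreflexive, since it is strict weak)} \\
&\Rightarrow\ \sim a^* R b^* && \text{(since R is strict weak).}
\end{aligned}$$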


This proves (5.9). Finally, aEa* and bEb*, so it is easy to show (5.10), 
using the properties of a strict weak order. 

It is left to prove that for all a in $Re^n$ there is a unique diagonal $a^*$ such
that $aEa^*$. We first prove that there is an $a^*$. To prove this, let $a_m =
\min\{a_i\}$, let $a_M = \max\{a_i\}$, and let $a' = (a_m, a_m, \ldots, a_m)$ and $a'' =
(a_M, a_M, \ldots, a_M)$. If $a = a'$, let $a^* = a'$. If $a'' = a$, let $a^* = a''$. If $a \ne
a', a''$, then by dominance $a''RaRa'$. By continuity, there is $\lambda \in [0, 1]$ such
that

$$aE[\lambda a' + (1 - \lambda)a''].$$

Take $a^*$ to be $\lambda a' + (1 - \lambda)a''$. Finally, if $a^{**}$ is any other diagonal vector
such that $aEa^{**}$, we have $a^*Ea$ & $aEa^{**}$. Since (A, R) is a strict weak
order, E is transitive (by Theorem 1.3) and $a^*Ea^{**}$ follows. We conclude
by dominance and the fact that $a^*$ and $a^{**}$ are diagonal that $a^* = a^{**}$.
This completes the proof of Theorem 5.1. 
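
The construction in the proof can be tried numerically. The following is a minimal sketch, not the author's method: it assumes the preference relation is available as a Python predicate `prefers(a, b)` (a hypothetical name) satisfying conditions (a) through (c), and it replaces the existence argument from continuity with bisection along the diagonal between a' and a''. The example relation `prefers_by_sum` is an illustrative assumption.

```python
from typing import Callable, Sequence

Vector = Sequence[float]


def diagonal_utility(a: Vector,
                     prefers: Callable[[Vector, Vector], bool],
                     tol: float = 1e-9) -> float:
    """Approximate f(a): the common coordinate of the diagonal a* with a E a*."""
    n = len(a)
    lo, hi = min(a), max(a)      # a' = (lo, ..., lo) and a'' = (hi, ..., hi)
    while hi - lo > tol:
        mid = (lo + hi) / 2
        d = (mid,) * n           # candidate diagonal vector
        if prefers(d, a):        # d is too good, so a* lies below mid
            hi = mid
        elif prefers(a, d):      # d is too poor, so a* lies above mid
            lo = mid
        else:                    # neither is preferred: a E d
            return mid
    return (lo + hi) / 2


def prefers_by_sum(a: Vector, b: Vector) -> bool:
    """Hypothetical strict preference: a R b iff sum(a) > sum(b)."""
    return sum(a) > sum(b)


print(round(diagonal_utility((1.0, 2.0, 6.0), prefers_by_sum), 6))  # 3.0
```

For the sum relation the diagonal vector indifferent to (1, 2, 6) is (3, 3, 3), so the utility returned is the mean of the components. Bisection is justified here only because dominance makes preference strictly increasing along the diagonal.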


Exercises 


1. Show that the lexicographic ordering of the plane satisfies conditions
(a) and (b) of Theorem 5.1, but not condition (c).
2. Which of the conditions of Theorem 5.1 are satisfied by the following
binary relations R on $Re^n$:
(a) aRb iff $a > b$.
(b) aRb iff $\sum a_i > \sum b_i$.
(c) aRb iff $\prod a_i > \prod b_i$.
(d) aRb iff $a_1 > b_1$.
(e) aRb iff $\max a_i > \max b_i$.
3. (a) Show that under the hypotheses of Theorem 5.1, if aRb and bRc,
then there is exactly one $\lambda$, $0 \le \lambda \le 1$, so that

$$[\lambda a + (1 - \lambda)c]\,Eb.$$

(b) However, show that if aSb and bSc, where S is defined by Eq.
(5.2), there may be more than one $\lambda$, $0 \le \lambda \le 1$, so that

$$[\lambda a + (1 - \lambda)c]\,Eb.$$


4. Which of the conditions (a) through (c) are necessary for the repre- 
sentation (5.7)? 

5. Develop an alternative proof of Theorem 5.1 by showing that under 
conditions (a) through (c), (A*, R*) contains a countable order-dense 
subset, and then applying Corollary 1 of Theorem 3.4. (This method of 
proof is presented by Fishburn [1970, p. 32].) 

6. We prefer the present proof of Theorem 5.1 because it gives a 
constructive proof that conditions (a) through (c) guarantee the existence 
of a continuous utility function, using the usual topology on $Re^n$ and Re;
for the function constructed is continuous. Show this. Cf. Exer. 22a, 
Section 3.1. 


5.4 Conjoint Measurement 


5.4.1 Additivity 


Suppose a set of alternatives A has a product structure, i.e., A is
$A_1 \times A_2 \times \cdots \times A_n$. If we want to calculate utility of elements of
A, it would be easier to calculate utility on each attribute
separately, and then add. Specifically, we would like to express an
ordinal utility function $f: A \to Re$ as a sum of real-valued functions
$f_1, f_2, \ldots, f_n$ on $A_1, A_2, \ldots, A_n$, respectively. That is, we would like

$$f(a_1, a_2, \ldots, a_n) = f_1(a_1) + f_2(a_2) + \cdots + f_n(a_n). \qquad (5.5)$$

A utility function f for which there are functions $f_i$ satisfying (5.5) is called
additive. More generally, we would like to calculate utility on each
attribute separately and then combine in some way. That is, we would like to
find real-valued functions $f_1, f_2, \ldots, f_n$ on $A_1, A_2, \ldots, A_n$, respectively,
and a function $F: Re^n \to Re$ such that

$$f(a_1, a_2, \ldots, a_n) = F[f_1(a_1), f_2(a_2), \ldots, f_n(a_n)]. \qquad (5.11)$$

In this section, we consider the special case

$$F(x_1, x_2, \ldots, x_n) = x_1 + x_2 + \cdots + x_n,$$

that is, the representation (5.5). We consider other examples of
composition rules or functions F in Section 5.5.

Suppose R is the binary relation of strict preference on A. Then we seek 
real-valued functions $f_i$ on $A_i$ so that for all $a = (a_1, a_2, \ldots, a_n)$ and
$b = (b_1, b_2, \ldots, b_n)$ in A,

$$aRb \iff f_1(a_1) + f_2(a_2) + \cdots + f_n(a_n) > f_1(b_1) + f_2(b_2) + \cdots + f_n(b_n). \qquad (5.12)$$


Technically, the representation (5.12) does not fit the general framework 
for fundamental measurement, as described in Section 2.1. We are not 
seeking a homomorphism from one relational system to another. However, 
we shall treat the representation (5.12) in the same spirit as the representa- 
tions studied in Chapter 3, and seek (necessary and) sufficient conditions 
for the existence of functions f, satisfying (5.12). The representation (5.12) 
is often called (additive) conjoint measurement, because different compo- 
nents are being measured conjointly. 

Conjoint measurement has potential applications in areas other than 
utility. In studying response strength, let D be a set of drives and K a set of 
incentives. Let R be a binary relation on $D \times K$, with $(d_1, k_1)R(d_2, k_2)$
interpreted as follows: if drive $d_1$ is coupled with incentive $k_1$, then the
response is stronger than if drive $d_2$ is coupled with incentive $k_2$. We seek
functions $\delta: D \to Re$ and $\kappa: K \to Re$ so that for all $d_1, d_2 \in D$ and $k_1,
k_2 \in K$,

$$(d_1, k_1)R(d_2, k_2) \iff \delta(d_1) + \kappa(k_1) > \delta(d_2) + \kappa(k_2). \qquad (5.13)$$

If we bring in habit strength, we have a third set H of habit strengths, R is
a binary relation “responds more strongly than” on $D \times K \times H$, and we
seek functions $\delta$, $\kappa$, and $\mu: H \to Re$ such that for all $d_1, d_2 \in D$, $k_1, k_2 \in K$,
and $h_1, h_2 \in H$,

$$(d_1, k_1, h_1)R(d_2, k_2, h_2) \iff \delta(d_1) + \kappa(k_1) + \mu(h_1) > \delta(d_2) + \kappa(k_2) + \mu(h_2). \qquad (5.14)$$


In studies of binaural loudness in auditory perception, let $\mathcal{L}$ be a set of
sounds presented to the left ear and $\mathcal{R}$ be a set of sounds presented to the
right ear. Let R be a binary relation on $\mathcal{L} \times \mathcal{R}$, with $(\ell_1, r_1)R(\ell_2, r_2)$
interpreted as follows: If sounds $\ell_1$ and $r_1$ are presented simultaneously to
the left and right ear, respectively, the result is judged louder than if
sounds $\ell_2$ and $r_2$ are presented simultaneously to the left and right ear,
respectively. Then, assuming the effects to the two ears are additive,*
psychologists seek functions $l: \mathcal{L} \to Re$ and $r: \mathcal{R} \to Re$ so that for all
$\ell_1, \ell_2 \in \mathcal{L}$ and $r_1, r_2 \in \mathcal{R}$,

$$(\ell_1, r_1)R(\ell_2, r_2) \iff l(\ell_1) + r(r_1) > l(\ell_2) + r(r_2). \qquad (5.15)$$


In mental testing, to study the interaction between a subject and an item 
on a mental test, suppose S is a set of subjects and T a set of items. Let R 
be a binary relation on $S \times T$, with $(s_1, t_1)R(s_2, t_2)$ interpreted to mean
that subject $s_1$ scores better on item $t_1$ than subject $s_2$ does on item $t_2$. It is
tempting to try to assume that subjects and items are independent, and
that the score of a subject depends only on his ability and the difficulty of
the item, and not on how hard the item is for him personally. Then, we
would like ability and difficulty functions $\alpha: S \to Re$ and $\delta: T \to Re$ so that
for all $s_1, s_2 \in S$ and $t_1, t_2 \in T$,

$$(s_1, t_1)R(s_2, t_2) \iff \alpha(s_1) + \delta(t_1) > \alpha(s_2) + \delta(t_2). \qquad (5.16)$$


Finally, in studying discomfort under different weather conditions, 
suppose T is a set of temperatures being studied and H a set of humidities. 
Let R be a binary relation on $T \times H$, with $(t_1, h_1)R(t_2, h_2)$ interpreted as
follows: A subject is more uncomfortable at temperature $t_1$ and humidity
$h_1$ than at temperature $t_2$ and humidity $h_2$. Then we seek to measure
discomfort by a discomfort index or a temperature-humidity index. If the
temperature and humidity effects can be separated and added, then we can
find functions $\tau: T \to Re$ and $\gamma: H \to Re$ so that for all $t_1, t_2 \in T$ and
$h_1, h_2 \in H$,

$$(t_1, h_1)R(t_2, h_2) \iff \tau(t_1) + \gamma(h_1) > \tau(t_2) + \gamma(h_2). \qquad (5.17)$$

The number $\tau(t) + \gamma(h)$ is the discomfort index.


*This assumption, sometimes known as the loudness summation hypothesis, has been
subjected to extensive discussion. See footnote f, p. 154 for references.

We shall seek conditions (necessary and) sufficient for the existence of
functions satisfying representations like (5.12) through (5.17).
Concentrating on (5.12), let us take n = 2 and give an example. Suppose $A_1 = \{\alpha, \beta\}$
and $A_2 = \{x, y\}$, and suppose that preference is a strict simple order in
which $(\alpha, x)$ is strictly preferred to $(\alpha, y)$, which is strictly preferred to
$(\beta, x)$, which is strictly preferred to $(\beta, y)$. Then an ordinal utility function
f is given by

$$f(\alpha, x) = 3, \quad f(\alpha, y) = 2, \quad f(\beta, x) = 1, \quad f(\beta, y) = 0.$$

The function f is additive, for we may take

$$f_1(\alpha) = 2, \quad f_1(\beta) = 0, \quad f_2(x) = 1, \quad f_2(y) = 0.$$

Then $f(a_1, a_2) = f_1(a_1) + f_2(a_2)$. Also, of course, the functions $f_1$ and $f_2$
satisfy (5.12), for

$$f_1(\alpha) + f_2(x) > f_1(\alpha) + f_2(y) > f_1(\beta) + f_2(x) > f_1(\beta) + f_2(y).$$


Using the same $A_1$ and $A_2$, suppose $(\beta, x)$ is strictly preferred to $(\beta, y)$ and
$(\alpha, y)$ is strictly preferred to $(\alpha, x)$. Then no additive conjoint
representation exists. For if one did, then $(\beta, x)R(\beta, y)$ implies

$$f_1(\beta) + f_2(x) > f_1(\beta) + f_2(y),$$

so $f_2(x) > f_2(y)$. However, $(\alpha, y)R(\alpha, x)$ implies

$$f_1(\alpha) + f_2(y) > f_1(\alpha) + f_2(x),$$

so $f_2(y) > f_2(x)$, a contradiction.
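
The two examples above can be checked mechanically. The sketch below is illustrative only: the names `satisfies` and `search`, the string labels, and the small integer grid are assumptions, and a failed grid search is not a proof of non-existence (the contradiction argument above is). For the strict simple order in the first example, satisfying the three listed preferences is enough for (5.12), since the remaining comparisons follow by transitivity.

```python
from itertools import product

A1 = ("alpha", "beta")
A2 = ("x", "y")


def satisfies(prefs, f1, f2):
    """True if f1 + f2 is strictly larger at a than at b for every listed pair (a, b)."""
    return all(f1[a[0]] + f2[a[1]] > f1[b[0]] + f2[b[1]] for a, b in prefs)


def search(prefs, grid=range(4)):
    """Brute-force search for f1, f2 with values in a small integer grid."""
    for values in product(grid, repeat=4):
        f1 = dict(zip(A1, values[:2]))
        f2 = dict(zip(A2, values[2:]))
        if satisfies(prefs, f1, f2):
            return f1, f2
    return None


# First example: (alpha, x) over (alpha, y) over (beta, x) over (beta, y).
chain = [(("alpha", "x"), ("alpha", "y")),
         (("alpha", "y"), ("beta", "x")),
         (("beta", "x"), ("beta", "y"))]

# Second example: (beta, x) over (beta, y), but (alpha, y) over (alpha, x).
reversal = [(("beta", "x"), ("beta", "y")),
            (("alpha", "y"), ("alpha", "x"))]

print(search(chain))     # an additive representation is found
print(search(reversal))  # None
```

With this enumeration order the first representation found for the chain is the one given in the text: 2 and 0 on the first component, 1 and 0 on the second.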

Conditions sufficient for additive conjoint measurement were first pre- 
sented by Debreu [1960]. Some of his conditions were topological in 
nature. Algebraic sufficient conditions in the spirit of those in Chapter 3 
were first presented by Luce and Tukey [1964]. More refined conditions 
can be found in Krantz et al. [1971, Chapter 6]. In Section 5.4.3, we 
present these conditions. In Section 5.4.5, we present necessary and 
sufficient conditions for additive conjoint measurement in the case where 
each $A_i$ is finite. Without the assumption of finiteness, presentation of such
necessary and sufficient conditions is still an open problem. 

Before leaving this subsection, we should remark that, in spite of 
historical prejudice, there is nothing magic about addition as opposed to 
some other operation. For, just as in the case of extensive measurement, if 
$f_1, f_2, \ldots, f_n$ satisfy Eq. (5.12), then $g_1 = e^{f_1}, g_2 = e^{f_2}, \ldots, g_n = e^{f_n}$ are
positive and satisfy

$$aRb \iff g_1(a_1)\,g_2(a_2) \cdots g_n(a_n) > g_1(b_1)\,g_2(b_2) \cdots g_n(b_n). \qquad (5.18)$$

Conversely, if $g_1, g_2, \ldots, g_n$ are positive and satisfy Eq. (5.18), then
$f_1 = \ln g_1, f_2 = \ln g_2, \ldots, f_n = \ln g_n$ satisfy Eq. (5.12). The multiplicative
representation (5.18) is often more useful than the additive one. For 
example, in physics, momentum has a multiplicative representation, as 
mass times velocity. 
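
The passage from (5.12) to (5.18) can be checked on a small case. The sketch below is only an illustration: it reuses the component values from the two-by-two example earlier in this subsection, and the dictionary layout and function names are assumptions.

```python
import math
from itertools import product

# Additive component scales (values from the two-by-two example).
f1 = {"alpha": 2.0, "beta": 0.0}
f2 = {"x": 1.0, "y": 0.0}

# Multiplicative scales g_i = exp(f_i); these are positive by construction.
g1 = {k: math.exp(v) for k, v in f1.items()}
g2 = {k: math.exp(v) for k, v in f2.items()}

alternatives = list(product(f1, f2))


def prefers_additive(a, b):
    return f1[a[0]] + f2[a[1]] > f1[b[0]] + f2[b[1]]


def prefers_multiplicative(a, b):
    return g1[a[0]] * g2[a[1]] > g1[b[0]] * g2[b[1]]


same_order = all(prefers_additive(a, b) == prefers_multiplicative(a, b)
                 for a in alternatives for b in alternatives)
print(same_order)  # True
```

The two comparisons agree because the exponential function is strictly increasing and turns sums into products.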


5.4.2 Conjoint Measurement and the Balance of Trade* 


As an aside, we present in this subsection an amusing application of 
additive conjoint measurement, which is based on an idea of David Gale.†
Imagine that there are two countries, each of which produces just one
product. In Country A, the average individual receives 4 units of this
product (measured, for example, in dollars) in his lifetime, split evenly into
2 units in the first half of his life and 2 units in the second half of his life.
In Country B, the average individual also receives 4 units of the product in
his lifetime, but 1 in the first half and 3 in the second half. Now individuals
in Country B live a “deprived” youth and a “wealthy” old age. They might 
be willing to trade some of the wealth they expect in their old age for more 
in their youth. Individuals in Country A might be willing to part with a 
small portion of the wealth they receive in their youth, if they are 
compensated in their old age, with interest, thus achieving a lifetime 
increase in wealth. 

In general, if a represents the amount of the product obtained in the first
half of an individual’s life and b represents the amount obtained in the
second half, a pair (a, b) represents a possible distribution.‡ According to
our argument, individuals in Country B might prefer (4/3, 5/2) to (1, 3)
and individuals in Country A might prefer (5/3, 5/2) to (2, 2). Even if
there is no unequal treatment of the two time periods, these preferences
can be accounted for by an additive utility function, namely

$$f(a, b) = \log_{10} a + \log_{10} b.$$

For then f(4/3, 5/2) = .523, f(1, 3) = .477, f(5/3, 5/2) = .620, and
f(2, 2) = .602. (Such a utility function is not unreasonable, at least if a is
measured in dollars. The idea that utility of a dollars is a logarithmic 
function of a goes back to Daniel Bernoulli (Stevens [1959, p. 47]). We 
discussed this point in Section 4.3.4 and return to it in Chapter 7.) 
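
The four utility values quoted above are easy to reproduce. This is a minimal sketch; the function name `log_utility` and the dictionary of labels are illustrative choices.

```python
import math


def log_utility(a: float, b: float) -> float:
    """Additive utility f(a, b) = log10(a) + log10(b) of a distribution (a, b)."""
    return math.log10(a) + math.log10(b)


distributions = {
    "Country B after trade": (4 / 3, 5 / 2),
    "Country B before trade": (1, 3),
    "Country A after trade": (5 / 3, 5 / 2),
    "Country A before trade": (2, 2),
}

for label, (a, b) in distributions.items():
    print(f"{label}: f = {log_utility(a, b):.3f}")
```

In both countries the distribution after trade has the higher utility, which is the pattern of preferences described in the text.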

This highly idealized example is used by Gale to argue that a regular
negative balance of trade is not necessarily a bad state of affairs. The
average individual in Country B trades a (1, 3) distribution for a
(4/3, 5/2) distribution, thus having a negative balance of trade of
$1 + 3 - \frac{4}{3} - \frac{5}{2} = \frac{1}{6}$ units. If it is assumed that both populations remain
fixed, Country B can have a perpetual negative trade balance, and yet be
happy with it.

*This subsection may be omitted without loss of continuity.
†Presented to the American Association for the Advancement of Science, January 1975.
‡Compare the exercise of Section 5.2.


5.4.3 The Luce-Tukey Theorem 


In this section, we present the Krantz et al. [1971] refinement of the
Luce-Tukey theorem, which gives sufficient conditions for additive
conjoint measurement. We shall consider the case n = 2, namely the
representation

$$(a_1, a_2)R(b_1, b_2) \iff f_1(a_1) + f_2(a_2) > f_1(b_1) + f_2(b_2). \qquad (5.19)$$

We begin by stating some necessary conditions. If Eq. (5.19) holds, (A, R)
must be a strict weak order. We shall want to assume that. If a and b have
the same second component, then it follows from Eq. (5.19) that whether
or not aRb holds does not depend on the second component. That is, for
all $x, y \in A_1$ and $q, r \in A_2$,

$$(x, q)R(y, q) \iff (x, r)R(y, r). \qquad (5.20)$$

For, we have

$$f_1(x) + f_2(q) > f_1(y) + f_2(q) \iff f_1(x) > f_1(y) \iff f_1(x) + f_2(r) > f_1(y) + f_2(r).$$

Similarly, for all $x, y \in A_1$ and $q, r \in A_2$,

$$(x, q)R(x, r) \iff (y, q)R(y, r). \qquad (5.21)$$


We say that a binary relation $(A_1 \times A_2, R)$ satisfies independence on the
first component if (5.20) holds for all $x, y \in A_1$ and $q, r \in A_2$, independence
on the second component if (5.21) holds for all $x, y \in A_1$ and $q, r \in A_2$, and
independence if both of these conditions hold.
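
On finite component sets, conditions (5.20) and (5.21) can be tested exhaustively. The following sketch assumes the relation is supplied as a Python predicate `R(a, b)` on pairs; the function names and both example relations (an additive one and a hypothetical utility table with a crossover) are illustrative assumptions.

```python
from itertools import product


def independent_first(A1, A2, R):
    """Check (5.20): (x, q) R (y, q) iff (x, r) R (y, r)."""
    return all(R((x, q), (y, q)) == R((x, r), (y, r))
               for x, y in product(A1, repeat=2)
               for q, r in product(A2, repeat=2))


def independent_second(A1, A2, R):
    """Check (5.21): (x, q) R (x, r) iff (y, q) R (y, r)."""
    return all(R((x, q), (x, r)) == R((y, q), (y, r))
               for x, y in product(A1, repeat=2)
               for q, r in product(A2, repeat=2))


A1 = A2 = (0, 1)


def additive(a, b):
    return a[0] + a[1] > b[0] + b[1]


# Hypothetical utilities with a crossover: (1, 1) beats (0, 1), but (0, 0) beats (1, 0).
u = {(0, 0): 1, (1, 0): 0, (0, 1): 2, (1, 1): 3}


def crossover(a, b):
    return u[a] > u[b]


print(independent_first(A1, A2, additive), independent_second(A1, A2, additive))    # True True
print(independent_first(A1, A2, crossover), independent_second(A1, A2, crossover))  # False True
```

The crossover relation fails (5.20) because the ranking of the first-component values 0 and 1 reverses when the second component changes.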

We once again define the relations S and E from R by Eqs. (5.2) and
(5.3), respectively:

$$aSb \iff \sim bRa \qquad (5.2)$$

and

$$aEb \iff \sim aRb \ \&\ \sim bRa. \qquad (5.3)$$


The reader will recall that if R is strict preference, then S is weak
preference and E is indifference. A binary relation $(A_1 \times A_2, R)$ satisfies
the Thomsen condition if, for all $x, y, z \in A_1$ and $q, r, s \in A_2$,

$$(x, s)E(z, r) \ \&\ (z, q)E(y, s) \Rightarrow (x, q)E(y, r).$$

The Thomsen condition also follows from the representation (5.19). For,

$$(x, s)E(z, r) \Rightarrow f_1(x) + f_2(s) = f_1(z) + f_2(r)$$

and

$$(z, q)E(y, s) \Rightarrow f_1(z) + f_2(q) = f_1(y) + f_2(s).$$

Adding these two equations and canceling $f_1(z) + f_2(s)$ from both sides, we see

$$f_1(x) + f_2(q) = f_1(y) + f_2(r),$$

so

$$(x, q)E(y, r).$$

This condition will be more clearly understood if we define a binary
relation D on $A_1 \times A_2$ by

$$(x, p)D(y, q) \iff (x, q)E(y, p).$$


Then the Thomsen condition says that D is transitive. 
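
For finite components the Thomsen condition can also be verified by enumeration. The sketch below assumes R is given as a predicate on pairs and derives E from it as in Eq. (5.3); the function names and the perturbed utility used as a counterexample are hypothetical.

```python
from itertools import product


def indifferent(R, a, b):
    """a E b iff neither a R b nor b R a, as in Eq. (5.3)."""
    return not R(a, b) and not R(b, a)


def thomsen(A1, A2, R):
    """(x,s)E(z,r) and (z,q)E(y,s) must imply (x,q)E(y,r)."""
    for x, y, z in product(A1, repeat=3):
        for q, r, s in product(A2, repeat=3):
            if (indifferent(R, (x, s), (z, r))
                    and indifferent(R, (z, q), (y, s))
                    and not indifferent(R, (x, q), (y, r))):
                return False
    return True


A = (0, 1, 2)


def additive(a, b):
    return a[0] + a[1] > b[0] + b[1]


def u_perturbed(a):
    # Hypothetical utility: additive except for a bump at the single alternative (2, 0).
    return a[0] + a[1] + (0.5 if a == (2, 0) else 0.0)


def perturbed(a, b):
    return u_perturbed(a) > u_perturbed(b)


print(thomsen(A, A, additive))   # True
print(thomsen(A, A, perturbed))  # False
```

In the perturbed relation, (0, 1)E(1, 0) and (1, 2)E(2, 1) hold, but (0, 2) and (2, 0) are no longer indifferent, so the condition fails with x = 0, z = 1, y = 2, s = 1, r = 0, q = 2.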

We mentioned in Section 3.2 that most measurement axiomatizations 
have some form of Archimedean condition. The next condition we in- 
troduce is the Archimedean condition. To do this, we need to introduce 
some notation. A binary relation R on a product $A_1 \times A_2$ induces binary
relations $R_1$ and $R_2$ on the components $A_1$ and $A_2$, respectively, as follows:

$$xR_1y \iff (\exists q \in A_2)[(x, q)R(y, q)] \qquad (5.22)$$

and

$$qR_2r \iff (\exists x \in A_1)[(x, q)R(x, r)]. \qquad (5.23)$$

It is easy to see that if $(A_1 \times A_2, R)$ is a strict weak order and satisfies
independence, then $(A_i, R_i)$ is a strict weak order for i = 1, 2. The proof is
left to the reader. We may define $S_i$ and $E_i$ from $R_i$ by equations analogous
to (5.2) and (5.3), namely by

$$xS_iy \iff \sim yR_ix \qquad (5.24)$$

and

$$xE_iy \iff \sim xR_iy \ \&\ \sim yR_ix. \qquad (5.25)$$

If R is strict preference, then $R_i$ is an induced strict preference relation on
the ith component, $S_i$ is weak preference on the ith component, and $E_i$ is
indifference on the ith component.

To state our Archimedean condition, we introduce the notion of a 
standard sequence, a sequence of equally spaced elements on one of the 
components. This is a notion we encountered previously in our
axiomatization of difference measurement in Section 3.3. Suppose $(A_1 \times A_2, R)$ is an
independent strict weak order. Let N be a (finite or infinite) set of
consecutive integers (positive or negative). The sequence

$$\{x_i : x_i \in A_1,\ i \in N\}$$

is a standard sequence (on the first component) if there are $q, r \in A_2$ such
that $\sim qE_2r$ and for all $i, i + 1 \in N$,

$$(x_i, q)E(x_{i+1}, r).$$

The standard sequence is strictly bounded if there are x and y in $A_1$ such
that $xR_1x_i$ and $x_iR_1y$ for all $i \in N$. Similar definitions apply on the second
component. The idea of a standard sequence is that the difference between
two successive elements is the same. This follows from the representation
(5.19), since

$$(x_i, q)E(x_{i+1}, r) \iff f_1(x_{i+1}) - f_1(x_i) = f_2(q) - f_2(r).$$


The Archimedean condition on the real numbers can be restated to say 
that if we constantly add the same (nonzero) amount, then we eventually 
overstep any bound. Consequently, our Archimedean axiom will state that 
any strictly bounded standard sequence (on either component) is finite. 
This follows from the representation (5.19). For, suppose $\{x_i\}$ is an infinite
standard sequence (on the first component). Without loss of generality
take N = the set of all integers. We have already seen that

$$f_1(x_{i+1}) - f_1(x_i) = f_2(q) - f_2(r).$$

Thus if n > 0,

$$f_1(x_n) = f_1(x_0) - n[f_2(r) - f_2(q)].$$

Since $\sim qE_2r$, we have $f_2(r) - f_2(q) \ne 0$. If $f_2(r) - f_2(q) > 0$, fix any
$x \in A_1$. By the Archimedean condition on the reals, there is a positive
integer n such that

$$f_1(x_0) - n[f_2(r) - f_2(q)] < f_1(x),$$

so $xR_1x_n$. Similarly, if $f_2(r) - f_2(q) < 0$, there is for each $x \in A_1$ a positive
integer n such that $x_nR_1x$. Thus, $\{x_i\}$ is not strictly bounded.

The next two conditions we consider are no longer necessary; that is,
they do not follow from the representation (5.19). To the author’s
knowledge, the problem of finding conditions on $(A_1 \times A_2, R)$ that are both
necessary and sufficient for the representation (5.19) has not been solved 
in general. It has been solved for the case where A is finite by Scott [1964]; 
Scott’s result is stated in Section 5.4.5. A nonstandard generalization is 
provided by Narens [1974]. 

We say that the binary relation $(A_1 \times A_2, R)$ satisfies restricted
solvability (on the first component) if, whenever $x, \bar{y}, \underline{y} \in A_1$ and $q, r \in A_2$ and

$$(\bar{y}, r)R(x, q)R(\underline{y}, r),$$

then there exists $y \in A_1$ such that $(y, r)E(x, q)$. A similar definition holds
on the second component. The term solvability is usually used for a
condition which says that certain equations can be solved. A solvability
axiom on the first component would take the form: given $x \in A_1$, and q
and $r \in A_2$, there is y such that $(x, q)E(y, r)$; and similarly on the second
component. We only require solvability under certain restrictions, hence
the terminology restricted solvability. It is easy to see that even restricted
solvability is not a necessary condition for the representation (5.19). For,
let $A_1 = \{1, 3\}$ and $A_2 = \{6, 7\}$, define $f(a_1, a_2) = a_1 + a_2$ and define

$$(a_1, a_2)R(b_1, b_2) \iff f(a_1, a_2) > f(b_1, b_2).$$

Then $(A_1 \times A_2, R)$ is trivially representable, but restricted solvability fails,
since

$$(3, 6)R(1, 7)R(1, 6)$$

and there is no $y \in A_1$ such that $(y, 6)E(1, 7)$.
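
The counterexample can be confirmed by enumeration. The sketch below is illustrative: the function name, the predicate-based encoding of R, and the second component set {1, 2, 3} used as a contrasting case are all assumptions.

```python
from itertools import product


def R(a, b):
    """a R b iff a_1 + a_2 > b_1 + b_2, the relation defined in the text."""
    return a[0] + a[1] > b[0] + b[1]


def E(a, b):
    return not R(a, b) and not R(b, a)


def restricted_solvability_first(A1, A2):
    """Whenever (y_hi, r) R (x, q) R (y_lo, r), some y in A1 must satisfy (y, r) E (x, q)."""
    for x, y_hi, y_lo in product(A1, repeat=3):
        for q, r in product(A2, repeat=2):
            if R((y_hi, r), (x, q)) and R((x, q), (y_lo, r)):
                if not any(E((y, r), (x, q)) for y in A1):
                    return False
    return True


print(restricted_solvability_first((1, 3), (6, 7)))     # False
print(restricted_solvability_first((1, 2, 3), (6, 7)))  # True
```

With $A_1 = \{1, 3\}$ the pair (1, 7) has total 8, which lies strictly between the totals of (3, 6) and (1, 6), and no element of $A_1$ paired with 6 reaches 8. Adding the value 2 to the first component closes that gap.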

Given an independent strict weak order $(A_1 \times A_2, R)$, we say that the
ith component (i = 1, 2) is essential if there are $x, y \in A_i$ such that
$\sim xE_iy$. We shall assume that each component is essential. This is of
course not necessary. Although essentiality is simply a nontriviality 
assumption, it plays an important role. We shall see in Exer. 19 that if all 
the other assumptions hold and essentiality fails, then our representation 
theorem fails also. 

If $(A, R) = (A_1 \times A_2, R)$ is a binary relation, we shall say that it is an
additive conjoint structure if it satisfies the following conditions:

AXIOM C1. (A, R) is a strict weak order.

AXIOM C2. (A, R) satisfies independence.

AXIOM C3. (A, R) satisfies the Thomsen condition.

AXIOM C4. Every strictly bounded standard sequence on either component
is finite.

AXIOM C5. Restricted solvability holds (on each component).

AXIOM C6. Each component is essential.


THEOREM 5.2 (Luce and Tukey). Suppose $(A_1 \times A_2, R)$ is an additive
conjoint structure. Then there exist real-valued functions $f_1$ on $A_1$ and $f_2$ on
$A_2$ such that for all $(a_1, a_2)$ and $(b_1, b_2) \in A_1 \times A_2$,

$$(a_1, a_2)R(b_1, b_2) \iff f_1(a_1) + f_2(a_2) > f_1(b_1) + f_2(b_2). \qquad (5.19)$$

Moreover, if $f_1'$ and $f_2'$ are two other real-valued functions on $A_1$ and $A_2$,
respectively, with the same property, then there are real numbers $\alpha$, $\beta$, and $\gamma$,
with $\alpha > 0$, such that

$$f_1' = \alpha f_1 + \beta, \qquad f_2' = \alpha f_2 + \gamma.$$*


This theorem is proved in Krantz et al. [1971, p. 275]. We shall omit the
proof. It is instructive, however, to sketch the proof of a much simpler 
theorem than Theorem 5.2. We do that in the next subsection. 

Before leaving this subsection, let us discuss Axioms C1 through C6 as
axioms for preference among multidimensional alternatives. We have 
already discussed in Section 3.1 the assumption that preference is a strict 
weak order. The second axiom, independence, can be questioned. You 
might prefer ginger ale to coffee if you are having only a beverage, but 
prefer coffee and a doughnut to ginger ale and a doughnut. Even with 
market baskets, you can question independence. Suppose the first compo- 
nent is coffee and the second sugar. You might prefer (1, 1) to (0, 1) and 
(0, 0) to (1, 0) if you have a violent dislike of coffee without sugar and so 
would just have to find a place to dispose of it. 

Keeney and Raiffa [1976] give another example. A farmer’s preferences 
for various combinations of sunshine and rain will probably violate inde- 
pendence. For at one level of rain, the farmer might prefer more sunshine, 
whereas at another level of rain, he might prefer less. 

Independence can also be violated in the mental testing situation. For
example, if subject $s_1$ is good at arithmetic and subject $s_2$ is good at
vocabulary, and if item $t_1$ is an arithmetic item and item $t_2$ is a vocabulary
item, we might very well get $(s_1, t_1) R (s_2, t_1)$ but not $(s_1, t_2) R (s_2, t_2)$.


*Thus, in particular, each $f_i$ is a (regular) interval scale. This uniqueness result does not
hold for all systems $(A_1 \times A_2, R)$ representable in the form (5.19). See Exer. 24. In general, it
would be interesting to know under what conditions on a representable $(A_1 \times A_2, R)$ one can
obtain this uniqueness result.
[Figure 5.3. Curves of response strength versus drive for different levels of incentive. Cross-over
interaction violates independence. Vertical axis: response strength. Horizontal axis: drive, with
levels $d_1 = x$ and $d_2 = y$. One curve is drawn for each fixed incentive, $k_1 = q$ and $k_2 = r$.]


Suppose we plot the first variable $A_1$ on the horizontal axis and the
strength of response (as determined by a measure preserving the relation
R) on the vertical axis. Let us consider curves of response strength over
varying $a_1$ in $A_1$ and fixed $a_2$ in $A_2$. If these curves cross over, then there is
a violation of independence. For example, in the drive-incentive situation
pictured in Fig. 5.3, there is a violation of independence, for


$(x, r) R (x, q)$ and $(y, q) R (y, r)$.
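The crossover test can be sketched in a few lines of Python. This is only an illustration under stated assumptions: independence is taken to mean that the ordering of two levels of one component never depends on the fixed level of the other, the function name `is_independent` is ours, and the response strengths in both tables are hypothetical numbers chosen to mimic a crossover and an additive pattern.

```python
from itertools import product

def is_independent(A1, A2, strength):
    """Return True if the strict order induced by `strength` is independent.

    `strength[(a1, a2)]` is a number preserving R, i.e. x R y holds
    exactly when strength[x] > strength[y].
    """
    def R(x, y):
        return strength[x] > strength[y]

    # First component: the ordering of a1 versus b1 must not depend on a2.
    for a1, b1 in product(A1, repeat=2):
        if len({R((a1, a2), (b1, a2)) for a2 in A2}) > 1:
            return False
    # Second component: the ordering of a2 versus b2 must not depend on a1.
    for a2, b2 in product(A2, repeat=2):
        if len({R((a1, a2), (a1, b2)) for a1 in A1}) > 1:
            return False
    return True

# Hypothetical response strengths with a crossover: (x, r) R (x, q) but (y, q) R (y, r).
crossover = {("x", "q"): 1.0, ("x", "r"): 2.0, ("y", "q"): 4.0, ("y", "r"): 3.0}
# Hypothetical additive strengths (a drive value plus an incentive value).
additive = {("x", "q"): 1.0, ("x", "r"): 2.0, ("y", "q"): 3.0, ("y", "r"): 4.0}

print(is_independent(["x", "y"], ["q", "r"], crossover))  # False
print(is_independent(["x", "y"], ["q", "r"], additive))   # True
```

The check runs over every pair of levels, so it is practical only for small finite components.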


The third axiom, the Thomsen condition, is not very intuitive, and so it 
is hard to determine by a “thought experiment” whether or not preference 
on multidimensional alternatives would satisfy it. However, Levelt et al.
[1972] have done an experimental test of a more general condition than the
Thomsen condition, in the binaural loudness situation. This more general
condition is called double cancellation, and it says that for all $x, y, z \in A_1$
and $q, r, s \in A_2$,


$(x, s) S (z, r)$ & $(z, q) S (y, s) \Rightarrow (x, q) S (y, r)$.


It is easy to show that double cancellation follows from the representation 
(5.19) and that if (A, R) is strict weak, the Thomsen condition follows from 
double cancellation. In the Levelt et al. experiment, binaural pairs of 
stimuli (0, 7;) were ordered according to loudness, assuming the indepen- 
dence axiom. Then double cancellation was tested on this ordering, and it 
was found to be satisfied by more than 98% of the cases x, y, z and q, r,s. 
Coombs and Komorita [1958] tested the double cancellation axiom using 
preferences among gambles. A more recent test of the double cancellation 
axiom was carried out by Wallsten [1976] in the context of information 
processing under uncertainty. 
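For a finite structure, double cancellation can be checked exhaustively. The sketch below is an illustration only: it assumes the relation is given by a numerical table `value` with a S b exactly when value[a] >= value[b], the function name is ours, and the two tables (a sum and a maximum) are hypothetical examples rather than data from any of the experiments cited above.

```python
from itertools import product

def satisfies_double_cancellation(A1, A2, value):
    """Check that (x,s) S (z,r) and (z,q) S (y,s) imply (x,q) S (y,r)."""
    def S(a, b):
        # Weak preference: a S b iff not b R a, i.e. value[a] >= value[b].
        return value[a] >= value[b]

    for x, y, z in product(A1, repeat=3):
        for q, r, s in product(A2, repeat=3):
            if S((x, s), (z, r)) and S((z, q), (y, s)) and not S((x, q), (y, r)):
                return False
    return True

A1 = A2 = [1, 2, 3]
# Hypothetical additive table: double cancellation must hold.
additive = {(a, b): a + b for a, b in product(A1, A2)}
# Hypothetical non-additive table: the larger of the two components.
maximum = {(a, b): max(a, b) for a, b in product(A1, A2)}

print(satisfies_double_cancellation(A1, A2, additive))  # True
print(satisfies_double_cancellation(A1, A2, maximum))   # False
```

In the second table the violation occurs, for instance, at x = y = 1, z = 2, q = 1, r = s = 2: both hypotheses hold with equality, yet (1, 1) is strictly below (1, 2).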
At first glance, the fourth axiom, the Archimedean axiom, seems reasonable,
at least if A is thought of as a collection of market baskets or
commodity bundles. However, if A is not a collection of market baskets, 
there is some question about the axiom. In Section 3.2, we presented an 
argument against an Archimedean axiom. The argument was that no 
number of lamps could compensate for a long, healthy life. This argument 
seems to apply here as well. Thus if a is having a long, healthy life and a’ is 
having a short, sickly life, and if n is having n lamps, then (a, 1) might be 
preferred to (a’,n), for all n. This violates a form of the Archimedean 
axiom. Exercise 13 investigates this example further, and asks the reader to 
show that it does not violate the form of the Archimedean axiom we have 
introduced. However, as Exer. 14 points out, a modification of this 
example does violate a variant of our Archimedean axiom. 

A related example is given by David Krantz.* If $\ell$ is being a lawyer and
d is being a dishwasher, and if n is receiving a “bribe” of n dollars, then it
is possible that $(\ell, 0)$ is preferred to every (d, n); a person might not want
to be a dishwasher no matter what you bribed him. As usual, it is
impossible to “verify” an Archimedean axiom by test, since there would 
need to be infinitely many tests made. 

The fifth axiom, restricted solvability, probably is a reasonable one for
commodity bundles. If $(\bar{y}, r)$ is preferred to $(x, q)$, which is preferred to
$(\underline{y}, r)$, then some amount $y$ in between $\bar{y}$ and $\underline{y}$, combined with r, should be
judged equally preferable to $(x, q)$.† For other interpretations, this axiom
can be questioned. Suppose r and q are vehicle designs, and x is a power
source. It is quite possible that we can find a power source $\bar{y}$ for vehicle design r
which makes a system preferable to the system with source x and vehicle
design q, and another power source $\underline{y}$ for design r which makes the system
less preferable than the system (x, q), but that there is no power source y
for design r for which (y, r) and (x, q) are judged equally preferable.

Finally, the last axiom, essentiality, is introduced to avoid trivial situa- 
tions. 

Experimental tests of the conjoint measurement axioms Cl through C6 
have concentrated on the independence axiom and on the double cancella- 
tion axiom (hence the Thomsen condition). As Falmagne [1976] points out, 
most experimental tests of measurement axioms such as those for additive 
conjoint measurement are likely to lead to a few violations. Simply 
counting number of violations does not give us a good feel for fit of a 
model, and little statistical theory is available for making tests of fit. Also, 
some violations may be genuinely indicative of a systematic statistical 


*Seminar on “Testability of Measurement Axioms,” Center for Advanced Study in the 
Behavioral Sciences, Palo Alto, California, January, 1971. 

†A stronger solvability axiom might not hold for commodity bundles. For example, in the
coffee-sugar example discussed above, there is probably no y so that


$(1, 1) E (y, 0)$.




departure from the model. What is called for, according to Falmagne, is a
statistical or random analogue of a measurement theory. He develops such
an analogue for conjoint measurement, and uses it to make a preliminary 
test of the double cancellation axiom. The test rejects additivity, but the 
results are too limited in scope to permit a definite conclusion. See 
Falmagne [1979] and Falmagne, Iverson, and Marcovici [1978] for addi- 
tional results. Further work along these lines is in progress, and more is 
certainly needed. 

The additive representation (5.19 or 5.13 through 5.17) can be tested 
directly. In one such test, computer programs are used to “fit” the best 
possible additive representation, and then statistical tests are used to see if 
the data is accounted for by this additive representation. Using this 
procedure in the binaural loudness situation, Levelt et al. [1972] discovered 
that an additive representation fit their data very well. Tversky [1967] used 
a similar procedure in testing additivity of choices under risk. 


5.4.4 Conjoint Measurement under Equal Spacing 
To give the reader a feel for the conjoint measurement representation,
we state a simpler theorem than Theorem 5.2, and sketch a proof. Suppose
$(A_1 \times A_2, R)$ is an independent strict weak order. Define binary relations
$J_i$ ($i = 1, 2$) on $A_i$ by

$a_i J_i b_i \iff a_i R_i b_i \;\&\; (\forall c_i \in A_i)(c_i S_i a_i \text{ or } b_i S_i c_i \text{ but not both})$,

where $S_i$ is defined in Eq. (5.24). Note that $a_i J_i b_i$ holds if $a_i$ is strictly
preferred to $b_i$ and there is no element $c_i$ strictly in between in the strict
weak order $R_i$. If $a_i J_i b_i$, let us call $(a_i, b_i)$ a $J_i$-interval. We say that the
structure $(A_1 \times A_2, R)$ is equally spaced if for all $x, y \in A_1$ and $q, r \in A_2$,

$x J_1 y \;\&\; q J_2 r \Rightarrow (x, r) E (y, q)$. (5.26)


Condition (5.26) says that the length of any $J_1$-interval is the same as that
of any $J_2$-interval, and so all $J_i$ ($i = 1, 2$) intervals have the same length.
These results follow, since $(x, r) E (y, q)$ means that


$f_1(x) + f_2(r) = f_1(y) + f_2(q)$,


so


$f_1(x) - f_1(y) = f_2(q) - f_2(r)$.
We say that $(A_1 \times A_2, R)$ is an equally spaced additive conjoint structure
if it satisfies the following conditions:


AXIOM E1. Strict weak order.
AXIOM E2. Independence.
AXIOM E3. Equal spacing.


THEOREM 5.3. Suppose $(A_1 \times A_2, R)$ is an equally spaced additive conjoint
structure and $A_1$ and $A_2$ are finite. Then there exist real-valued
functions $f_1$ on $A_1$ and $f_2$ on $A_2$ such that for all $(a_1, a_2)$ and $(b_1, b_2)$ belonging
to $A_1 \times A_2$,


$(a_1, a_2) R (b_1, b_2) \iff f_1(a_1) + f_2(a_2) > f_1(b_1) + f_2(b_2)$. (5.19)


Moreover, if $f_1'$ and $f_2'$ are two other functions on $A_1$ and $A_2$, respectively, with
the same property, then there are real numbers $\alpha$, $\beta$, and $\gamma$, with $\alpha > 0$, such
that


$f_1' = \alpha f_1 + \beta, \quad f_2' = \alpha f_2 + \gamma$.


We sketch the proof much as do Krantz et al. [1971, pp. 36, 37]. Let $A_1$
have m elements and let $A_2$ have n elements. Assume first that $(A_1, R_1)$ and
$(A_2, R_2)$ are both strict simple orders. Since $A_1$ and $A_2$ are finite, we may
list their elements in increasing $R_i$-order. That is, we may suppose that
$A_1 = \{x_1, x_2, \ldots, x_m\}$ with $x_m R_1 x_{m-1} R_1 \cdots R_1 x_1$ and that
$A_2 = \{q_1, q_2, \ldots, q_n\}$ with $q_n R_2 q_{n-1} R_2 \cdots R_2 q_1$. We first observe that


$i + j = k + \ell \Rightarrow (x_i, q_j) E (x_k, q_\ell)$. (5.27)


The proof is by the equal-spacing assumption (Eq. 5.26) and mathematical
induction and is left to the reader. Next, note that


$i + j > k + \ell \Rightarrow (x_i, q_j) R (x_k, q_\ell)$. (5.28)


For $k + \ell = i + u$, where $j > u$. Thus by independence $(x_i, q_j) R (x_i, q_u)$,
and by (5.27), $(x_i, q_u) E (x_k, q_\ell)$. Equation (5.28) follows. Now define
$f_1(x_i) = i$, $f_2(q_j) = j$. Equation (5.19) follows by (5.27) and (5.28). This completes
the proof in the case that each $R_i$ is strict simple. If each $R_i$ is strict weak,
use its reduction $(A_i^*, R_i^*) = (A_i/E, R_i/E)$.

To prove the uniqueness statement, one observes that if $g_1(x_i) = i$ and
$g_2(q_j) = j$, then $g_1, g_2$ satisfy (5.19). Finally, if $f_1, f_2$ satisfy (5.19) and
$f_1(x_1) = \theta_1$, $f_2(q_1) = \theta_2$, and $f_1(x_2) = \tau$, then

$f_1 = (\tau - \theta_1)(g_1 - 1) + \theta_1$ (5.29)

and

$f_2 = (\tau - \theta_1)(g_2 - 1) + \theta_2$. (5.30)

Equation (5.29) follows from equal spacing, which implies that

$f_1(x_i) = (i - 1)[f_1(x_2) - f_1(x_1)] + f_1(x_1)$.


Equation (5.30) follows similarly. Similar equations hold for any other $f_1', f_2'$
satisfying (5.19). The uniqueness results follow.
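The construction in this proof is easy to try out numerically. The sketch below is only an illustration: the element names and the "true" scale values are hypothetical, chosen so that consecutive elements on both components differ by the same step (here 3), which is what equal spacing requires. It assigns each element its rank, as in the proof, checks that the ranks represent the same relation, and then recovers the original scales from the ranks by the affine formulas $f_1 = (\tau - \theta_1)(g_1 - 1) + \theta_1$ and $f_2 = (\tau - \theta_1)(g_2 - 1) + \theta_2$, where $\theta_1 = f_1(x_1)$, $\theta_2 = f_2(q_1)$, and $\tau = f_1(x_2)$.

```python
from itertools import product

# Hypothetical "true" scales with a common step of 3, so equal spacing holds.
true_f1 = {"low": 10, "mid": 13, "high": 16}
true_f2 = {"cheap": 2, "dear": 5}

def R(a, b):
    """Strict preference generated by the hypothetical true scales."""
    return true_f1[a[0]] + true_f2[a[1]] > true_f1[b[0]] + true_f2[b[1]]

# Elements listed in increasing R_i-order, as in the proof.
A1 = ["low", "mid", "high"]
A2 = ["cheap", "dear"]

# The construction from the proof: each element is mapped to its rank.
g1 = {x: i for i, x in enumerate(A1, start=1)}
g2 = {q: j for j, q in enumerate(A2, start=1)}

pairs = list(product(A1, A2))
represents = all(
    R(a, b) == (g1[a[0]] + g2[a[1]] > g1[b[0]] + g2[b[1]])
    for a in pairs
    for b in pairs
)

# Uniqueness: recover the true scales from the rank scales.
theta1, theta2, tau = true_f1[A1[0]], true_f2[A2[0]], true_f1[A1[1]]
recovered_f1 = {x: (tau - theta1) * (g1[x] - 1) + theta1 for x in A1}
recovered_f2 = {q: (tau - theta1) * (g2[q] - 1) + theta2 for q in A2}

print(represents)               # True
print(recovered_f1 == true_f1)  # True
print(recovered_f2 == true_f2)  # True
```

Note that the same multiplier, $\tau - \theta_1$, is used on both components; only the additive constants differ, in line with the uniqueness statement of the theorem.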


5.4.5 Scott’s Theorem 


In this section, we present Scott’s [1964] necessary and sufficient axioms
for additive conjoint measurement in the case where each $A_i$ is finite. We
again take n = 2. It is convenient to state Scott’s axioms in terms of the
binary relation S (weak preference) defined by Eq. (5.2):


$aSb \iff \sim bRa$. (5.2)

The first Scott condition is the following:

AXIOM SC1: For all $a_1, b_1$ in $A_1$ and $a_2, b_2$ in $A_2$,

$(a_1, a_2) S (b_1, b_2)$ or $(b_1, b_2) S (a_1, a_2)$.


This axiom follows from the stronger Luce-Tukey assumption that (A, R)
is strict weak. Axiom SC1 also clearly follows from the representation
(5.19).

The second Scott condition is the following. 


AXIOM SC2: Suppose $x_0, x_1, \ldots, x_{k-1}$ are in $A_1$ and $y_0, y_1, \ldots, y_{k-1}$ are
in $A_2$, and suppose $\pi$ and $\sigma$ are permutations of $\{0, 1, \ldots, k - 1\}$. If

$(x_i, y_i) S (x_{\pi(i)}, y_{\sigma(i)})$


for all $i = 1, 2, \ldots, k - 1$, then


$(x_{\pi(0)}, y_{\sigma(0)}) S (x_0, y_0)$.


Axiom SC2 is similar to Axiom SD3 in Scott’s axiomatization of difference
measurement (Section 3.3). To illustrate Axiom SC2, let us take $A_1 =
\{\alpha, \beta\}$, $A_2 = \{*, \dagger\}$, $k = 2$, $x_0 = \beta$, $x_1 = \alpha$, $y_0 = *$, and $y_1 = \dagger$. Suppose $\pi$
is the identity permutation, the permutation that takes 0 into 0 and 1 into
1. Suppose $\sigma$ is the permutation that takes 0 into 1 and 1 into 0. Axiom
SC2 says that if $(x_1, y_1) S (x_{\pi(1)}, y_{\sigma(1)})$, then $(x_{\pi(0)}, y_{\sigma(0)}) S (x_0, y_0)$. Thus, it says
that $(x_1, y_1) S (x_1, y_0)$ implies $(x_0, y_1) S (x_0, y_0)$. In our example, this says that
$(\alpha, \dagger) S (\alpha, *)$ implies $(\beta, \dagger) S (\beta, *)$. This is a necessary condition for the
representation (5.19). (It is a special case of independence, Section 5.4.3.)

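To show that Axiom SC2 follows from the representation (5.19), note
that since $\pi$ and $\sigma$ are permutations,


$\sum_{i=0}^{k-1} [f_1(x_i) + f_2(y_i)] = \sum_{i=0}^{k-1} f_1(x_i) + \sum_{i=0}^{k-1} f_2(y_i)$

$= \sum_{i=0}^{k-1} f_1(x_{\pi(i)}) + \sum_{i=0}^{k-1} f_2(y_{\sigma(i)})$

$= \sum_{i=0}^{k-1} [f_1(x_{\pi(i)}) + f_2(y_{\sigma(i)})]$,


for each $x_i$ and $y_i$ is listed once and only once in the next-to-last expression.
Note also that $(x_i, y_i) S (x_{\pi(i)}, y_{\sigma(i)})$ for $i = 1, 2, \ldots, k - 1$ implies that


$\sum_{i=1}^{k-1} [f_1(x_i) + f_2(y_i)] \geq \sum_{i=1}^{k-1} [f_1(x_{\pi(i)}) + f_2(y_{\sigma(i)})]$.

Thus,


$f_1(x_0) + f_2(y_0) \leq f_1(x_{\pi(0)}) + f_2(y_{\sigma(0)})$,


so


$(x_{\pi(0)}, y_{\sigma(0)}) S (x_0, y_0)$.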


THEOREM 5.4 (Scott). Suppose $A_1$ and $A_2$ are finite sets and R is a binary
relation on $A_1 \times A_2$. Then Axioms SC1 and SC2 are necessary and
sufficient for the existence of real-valued functions $f_1$ on $A_1$ and $f_2$ on $A_2$ such
that for all $(a_1, a_2)$ and $(b_1, b_2) \in A_1 \times A_2$,


$(a_1, a_2) R (b_1, b_2) \iff f_1(a_1) + f_2(a_2) > f_1(b_1) + f_2(b_2)$. (5.19)


As in the case of Scott’s axioms for difference measurement, the 
sufficiency proof in Theorem 5.4 uses a clever variant of the famous 
separating hyperplane theorem. We omit the proof. 
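Although the sufficiency proof is omitted, Axiom SC2 itself can be examined by brute force on a small structure, at least for small k. The sketch below is an illustration only: the function name, the cutoff `max_k`, and both value tables are our own hypothetical choices. It enumerates every choice of the $x_i$, the $y_i$, and the two permutations for k = 1, ..., `max_k`. Passing this check for finitely many k does not establish SC2, which quantifies over all k; a failure, however, does rule out the representation (5.19).

```python
from itertools import permutations, product

def satisfies_sc2(A1, A2, R, max_k=3):
    """Brute-force check of Scott's Axiom SC2 for k = 1, ..., max_k."""
    def S(a, b):
        return not R(b, a)  # Eq. (5.2): a S b iff not b R a

    for k in range(1, max_k + 1):
        for xs in product(A1, repeat=k):
            for ys in product(A2, repeat=k):
                for pi in permutations(range(k)):
                    for sigma in permutations(range(k)):
                        hypothesis = all(
                            S((xs[i], ys[i]), (xs[pi[i]], ys[sigma[i]]))
                            for i in range(1, k)
                        )
                        conclusion = S((xs[pi[0]], ys[sigma[0]]), (xs[0], ys[0]))
                        if hypothesis and not conclusion:
                            return False
    return True

def relation(values):
    """Strict relation induced by a table of numbers."""
    return lambda a, b: values[a] > values[b]

A1 = ["alpha", "beta"]
A2 = ["star", "dagger"]
# Hypothetical additive table: alpha = 1, beta = 0, star = 0, dagger = 2.
additive = {("alpha", "star"): 1, ("alpha", "dagger"): 3,
            ("beta", "star"): 0, ("beta", "dagger"): 2}
# Hypothetical crossover table: dagger beats star with alpha, but not with beta.
crossover = {("alpha", "star"): 1, ("alpha", "dagger"): 2,
             ("beta", "star"): 4, ("beta", "dagger"): 3}

print(satisfies_sc2(A1, A2, relation(additive)))   # True
print(satisfies_sc2(A1, A2, relation(crossover)))  # False
```

The crossover table fails already at k = 2, with exactly the choice of $x_0, x_1, y_0, y_1$, $\pi$, and $\sigma$ used in the illustration of Axiom SC2 above.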

Axiom SC1 seems quite reasonable for the case of preference. Axiom
SC2 also seems reasonable. But it is impossible to test it empirically.

Finally, we observe that Axiom SC2 is an infinite bundle of axioms, one
for each k. We do not get away with using only a finite number of k, since
the $x_i$ and $y_i$ are not necessarily distinct. Thus, $x_0, x_1, \ldots, x_{k-1}$ might all
be the same element.


Exercises 


1. Suppose in a mental test, subject $s_1$ does better than subject $s_2$ on
both test items, $t_1$ and $t_2$. Suppose subject $s_1$ does better on item $t_1$ than on
item $t_2$, but $s_2$ does better on $t_2$ than on $t_1$. Show that the additive
representation (5.16) fails. Do this

(a) by assuming the additive representation and reaching a con- 
tradiction; 

(b) by showing that one of the necessary axioms for an additive 
conjoint structure is violated; 

(c) by showing that one of Scott’s axioms is violated. 


2. (a) Show that if $A_1 = A_2 = Re$ and R is lexicographic preference on
$A = A_1 \times A_2$, then (A, R) does not satisfy additive conjoint measurement.
(b) Which axioms for an additive conjoint structure are violated? 


3. (Krantz et al. [1971, pp. 445-446]) Sidowski and Anderson [1967]
asked subjects to judge the attractiveness of working at certain occupations
in certain cities. The results (mean rating over subjects) are shown in Table
5.1. Suppose O is the set of occupations considered and C the set of cities,
and suppose r(a, b) is the rating of the alternative having occupation a in
city b. Then there are functions $o: O \to Re$ and $c: C \to Re$, so that for all
$a_1, b_1 \in O$ and $a_2, b_2 \in C$,


$r(a_1, a_2) > r(b_1, b_2) \iff o(a_1) + c(a_2) > o(b_1) + c(b_2)$.


(a) Verify that such functions are given by
o(Lawyer) = 1.4, o(Teacher) = 1.0, o(Accountant) = 0,
c(A) = 5.9, c(B) = 5.4, c(C) = 4.3, c(D) = 3.0.
Thus conjoint measurement (or additivity) is satisfied.
(b) Are there such functions if r(Teacher, A) is changed to 7.4?
(c) What if r(Teacher, A) is changed to 7.1?


Table 5.1. Rating r(a, b) of Attractiveness of City-Occupation Combinations*

(The entry 7.3 − means less than 7.3; the entry 3.2 + means more than 3.2.)

                        City
Occupation      A        B        C        D
Lawyer          7.3      6.8      5.7      4.4
Teacher         7.3 −    6.7      5.3      3.2 +
Accountant      5.9      5.4      4.3      3.2

*Data from Sidowski and Anderson [1967].
4. (Fishburn [1970, p. 51]) Suppose $A_1 = A_2 = \{1, 2, \ldots, n\}$, and
suppose


$(a_1, a_2) R (b_1, b_2) \iff f(a_1, a_2) > f(b_1, b_2)$.


(a) Show that if $f(a_1, a_2) = a_1 a_2$, then $(A_1 \times A_2, R)$ has an additive
conjoint representation.
(b) Show that if f(a), a.) = a, + a, + a,a,, then (A; X A, R) does 
not have an additive conjoint representation. 
(c) For each of the following functions f, does $(A_1 \times A_2, R)$ have an
additive conjoint representation?
(i) f(a), a2) = ay + aay. 
(ii) $f(a_1, a_2) = \max\{a_1, a_2\}$.
(iii) $f(a_1, a_2) = a_1 - a_2$.
(iv) $f(a_1, a_2) = 1/(a_1 a_2)$.
(v) $f(a_1, a_2) = a_1/(a_1 + a_2)$.
5. Suppose $A_1 = T$ is a set of (dry-bulb air) temperatures and $A_2 = H$
is a set of relative humidities. Then the temperature-humidity index (THI)
is defined as


$\mathrm{THI}(t, h) = t - (0.55 - 0.55h)(t - 58)$,


where $t \in T$ and $h \in H$ (Conway [1963]). Suppose R is a binary relation
on $T \times H$, with aRb interpreted as “is more uncomfortable under a than
b.” Show that if THI preserves R, then there are no functions $\tau$ and $\eta$ on T
and H, respectively, so that for all $t_1, t_2 \in T$ and $h_1, h_2 \in H$,


$(t_1, h_1) R (t_2, h_2) \iff \tau(t_1) + \eta(h_1) > \tau(t_2) + \eta(h_2)$. (5.17)


6. The THI can also be defined (Conway [1963]) as (0.4)(td + tw) +
15, where td is the dry-bulb air temperature and tw is the wet-bulb air
temperature. Suppose D is a set of dry-bulb temperatures and W a set of
wet-bulb temperatures and THI preserves the relation R of Exer. 5. Show
that there are functions $\delta$ on D and $\omega$ on W so that for all $td_1, td_2 \in D$ and
$tw_1, tw_2 \in W$,


$(td_1, tw_1) R (td_2, tw_2) \iff \delta(td_1) + \omega(tw_1) > \delta(td_2) + \omega(tw_2)$.


7. Suppose $A_1 = A_2 = Re$ and
$(a_1, a_2) R (b_1, b_2) \iff [a_1 > b_1 \text{ and } a_2 > b_2]$.


Which of the axioms for an additive conjoint structure hold? 


8. For each of the following binary relations R on $Re \times Re$, check
which of the axioms for an additive conjoint structure are satisfied:
(a) aRb iff $\max\{a_1, a_2\} > \max\{b_1, b_2\}$.
(b) aRb iff $a_1 a_2 > b_1 b_2$.
(c) aRb iff $a_1 > b_1$.
9. Does the binary relation $(A_1 \times A_2, R)$ defined in Exer. 7 satisfy Scott’s
Axiom SC2?

10. Check which of Scott’s axioms are satisfied by the binary relations
of Exer. 8 if each is considered a relation on


$\{1, 2, \ldots, n\} \times \{1, 2, \ldots, n\}$.


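11. Suppose $A_1 = \{\alpha, \beta\}$, $A_2 = \{*, \dagger\}$, and R is the following binary
relation on $A_1 \times A_2$:


$\{\langle(\alpha, *), (\alpha, \dagger)\rangle, \langle(\alpha, *), (\beta, *)\rangle, \langle(\alpha, *), (\beta, \dagger)\rangle,$
$\langle(\alpha, \dagger), (\beta, \dagger)\rangle, \langle(\beta, *), (\beta, \dagger)\rangle\}$.


Is $(A_1 \times A_2, R)$ an equally spaced additive conjoint structure?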

12. Suppose $A_1 = A_2 = \{0, 1\}$ and R is lexicographic preference on
$A = A_1 \times A_2$. Show that (A, R) has an additive conjoint representation
(5.19), but it is not an equally spaced additive conjoint structure.


13. The next two exercises investigate an example used to argue against
the Archimedean axiom for additive conjoint measurement. Suppose $A_1 =
\{a, a'\}$ and $A_2 = N$. Suppose


$(a, i) R (a, j) \iff i > j$,
$(a', i) R (a', j) \iff i > j$,
$(a, i) R (a', j)$, all $i, j$.
(a) Show directly that every strictly bounded standard sequence on
either component is finite.
(b) Verify (a) indirectly by showing that $(A_1 \times A_2, R)$ is representable
in the form (5.19).
14. Suppose $A_1 = \{a, a', b\}$ and $A_2 = N$. Suppose on $\{a, a'\} \times N$, R is
defined as in Exer. 13. Suppose moreover that
$(b, i) R (b, j) \iff i > j$,
$(a, i) R (b, i + 100)$, all $i$,
$(b, i) R (a', j)$, all $i, j$,
and otherwise R is defined so as to make a strict weak order. Think of b as 
the alternative “having a long, healthy life with one sickness.” 
(a) Show that every strictly bounded standard sequence on either 
component is finite. 
(b) Show that (A, X A,, R) is not representable in the form (5.19). 
(c) Which of the axioms for additive conjoint measurement fail? 
(d) Show that the following statement is true, and hence a variant of 
the Archimedean axiom fails. The sequence 


1, 101, 201, ... 
forms a standard sequence on the second component and the sequence 
(a’, 1), (a’, 101), (a’, 201), .. . 


is strictly bounded and infinite. 
15. Suppose $f_1$ and $f_2$ define an additive conjoint representation.
(a) Show that the following statements are meaningful: 
(i) f(a, a) > f(b, b). 
Gi) f(a,, a2) is constant. 
(b) Show that the following statement is not meaningful: 


f(a, a2) = 2f(b,, 52). 
16. (Krantz et al. [1971]) Let


$A_1 = A_2 = \{2^n : n \text{ is a positive integer}\}$,


$(a_1, a_2) R (b_1, b_2) \iff a_1 + a_2 > b_1 + b_2$.


Show that $(A_1 \times A_2, R)$ satisfies Axioms C1 through C4 and C6 of
additive conjoint measurement, but not Axiom C5. (In particular, this
shows that Axioms C1 through C4 are not sufficient for conjoint measurement.)

17. Let $A_1$ and $A_2$ be as in Exer. 16, and let R on $A_1 \times A_2$ be the
lexicographic ordering. Show that $(A_1 \times A_2, R)$ satisfies Axioms C1
through C4 and C6 of additive conjoint measurement, but not Axiom C5.


18. Let $A_1$ and $A_2$ be as in Exer. 16, and let
$(a_1, a_2) R (b_1, b_2) \iff [a_1 \geq b_1 \;\&\; a_2 \geq b_2 \;\&\; (a_1 > b_1 \text{ or } a_2 > b_2)]$;


that is, $aRb \iff a > b$. Show that $(A_1 \times A_2, R)$ satisfies Axioms C2, C4, C5,
and C6 of additive conjoint measurement, but not Axioms C1 and C3.

19. This exercise and the next one are intended to investigate the role of 
essentiality (Axiom C6) in the axioms for an additive conjoint structure. 
Suppose $A_1 = Re \times Re$, $A_2 = \{0\}$, T is the lexicographic ordering of $A_1$,
and


$(\langle a, b\rangle, 0) R (\langle c, d\rangle, 0) \iff \langle a, b\rangle T \langle c, d\rangle$.


(a) Show that $(A_1 \times A_2, R)$ satisfies Axioms C1 through C5, but not
C6.
(b) Moreover, show that the representation (5.19) fails to hold.
20. Suppose we modify the example of the previous exercise to make
$A_1 = Re \times Re$, $A_2 = \{0, 1\}$, and T the lexicographic ordering of $A_1$.
Suppose


$\langle a, b\rangle J \langle c, d\rangle \iff \sim \langle a, b\rangle T \langle c, d\rangle \;\&\; \sim \langle c, d\rangle T \langle a, b\rangle$.

Let


$(\langle a, b\rangle, i) R (\langle c, d\rangle, j) \iff [\langle a, b\rangle T \langle c, d\rangle \text{ or } (\langle a, b\rangle J \langle c, d\rangle \;\&\; i > j)]$.


(a) Show that the representation (5.19) fails to hold. 
(b) Show that essentiality, Axiom C6, now holds. 
(c) Which of the axioms C1 through C5 fail and which hold?


21. (Krantz et al. [1971]) Show that the axioms for an additive conjoint
structure are independent.


22. Prove the following assertions made in the proof of Theorem 5.3: 
(a) Equation (5.27) holds. 
(b) Equations (5.29) and (5.30) hold. 
(c) The uniqueness results follow from (5.29) and (5.30) for both
$f_1, f_2$ and $f_1', f_2'$.

23. (Krantz et al. [1971]) Suppose that $(A_1 \times A_2, R)$ satisfies Axioms C1
and C4 of additive conjoint measurement and also the following conditions:

(i) (Unrestricted) solvability on both components.
(ii) Double cancellation.
(iii) At least one component is essential.
Show that $(A_1 \times A_2, R)$ is an additive conjoint structure.

24. Suppose $|A_2| = 1$ and $f_1$ and $f_2$ satisfy Eq. (5.19). Show that if $f_1'$ and
$f_2'$ also satisfy Eq. (5.19), the uniqueness result of Theorem 5.2 does not
hold. In particular, show that all monotone increasing transformations of $f_1$
are admissible.


25. (Krantz et al. [1971, Section 10.7.2], Luce [1965], Marley [1968],
Roberts [1974]) Suppose A is a set of physical objects in motion, E is
kinetic energy, m is mass, and v is velocity. Then $E = \frac{1}{2}mv^2$. Now $m = f_1$
and $v = f_2$ can be looked at as extensive measures (Section 3.2) by
introducing a relation $>_1$ and an operation $\circ_1$ on $A_1 = A$ and similar $>_2$
and $\circ_2$ on $A_2 = A$. Moreover, if $h_1 = \frac{1}{2}m$ and $h_2 = v^2$, then $g_1 = \log h_1$
and $g_2 = \log h_2$ can be looked at as conjoint measures on $A_1 \times A_2$ for an
appropriate relation R. These conjoint measures and extensive measures
are related: there are constants $\gamma_1$, $\gamma_2$, $\alpha_1 > 0$, $\alpha_2 > 0$, with $\alpha_1/\alpha_2$ rational, so
that


$h_1 = \gamma_1 f_1^{\alpha_1}, \quad h_2 = \gamma_2 f_2^{\alpha_2}$. (5.31)


In some sense, therefore, the conjoint measures are closely related to the 
extensive measures, and the two ways of measuring kinetic energy are 
consistent. This exercise explores conditions under which this result holds 
in general, and sketches out a proof that these conditions work. 

Suppose $(A_1 \times A_2, R)$ is an additive conjoint structure, and suppose
$(A_i, >_i, \circ_i)$ is an extensive structure (Section 3.2) for i = 1, 2. We say that
a law of exchange holds if there are positive integers m and n such that for
all positive integers i and j and for all a in $A_1$ and u in $A_2$,

$(i^m a, j^n u) E (j^m a, i^n u)$.


For i = 1, 2, suppose $g_i: A_i \to Re$ and $f_i: A_i \to Re$ are such that the
following conditions hold:
(i) $(a, b) R (c, d) \iff g_1(a) + g_2(b) > g_1(c) + g_2(d)$.
(ii) $a >_i b \iff f_i(a) > f_i(b)$.
(iii) $f_i(a \circ_i b) = f_i(a) + f_i(b)$.
Choose $f_i$ so that $f_i(a) = 1$ for some a. (This can always be done.) Let
$h_i = e^{g_i}$. Then
(iv) $(a, b) R (c, d) \iff h_1(a) h_2(b) > h_1(c) h_2(d)$.

(a) Define $\phi_i$ on the range of $f_i$ by $\phi_i(f_i(a)) = h_i(a)$. Show that $\phi_i$ is
strictly increasing.

(b) Suppose (5.31) holds for some $\gamma_1$, $\gamma_2$, $\alpha_1 > 0$, $\alpha_2 > 0$, with $\alpha_1/\alpha_2$
rational. Show that a law of exchange holds for some m, n.

(c) Prove that if a law of exchange holds with positive integers m
and n, then for all a in $A_1$ and u in $A_2$, and positive integers i and j,


$h_1(i^m a)/h_1(j^m a) = h_2(i^n u)/h_2(j^n u)$. (5.32)


(d) If i and j are positive integers, let h(i, j) denote the common
ratio in (5.32). Show that if i, j, k, $\ell$ are positive integers, then if $i/j = k/\ell$,
it follows that $h(i, j) = h(k, \ell)$.

(e) Show from the above that we may write h(i, j) as h(i/j).

(f) Observe that h is a function from the positive rationals to the
reals. Moreover, h is strictly increasing, and for r and s positive rationals,
h(rs) = h(r)h(s).

(g) It is easy to extend h to all positive reals by letting


$h(x) = \sup\{h(y) : 0 < y \leq x, \; y \text{ positive rational}\}$.


Show that $h(x) = x^\beta$, some $\beta > 0$. [The proof uses Cauchy’s fourth
equation (Section 4.2) and the observation (Exer. 12 of Section 4.2) that
our solution to this holds for monotone increasing functions.]

(h) If $f_1(a) = \lambda^m$, some a in $A_1$, define $\psi_1(\lambda) = \phi_1(\lambda^m)$. Show that if i
is a positive integer, and $\lambda$ is in the domain of $\psi_1$, then so is $i\lambda$.

(i) Setting j = 1 in (5.32) and using the result of (g), show that
$h_1(i^m a) = i^\beta h_1(a)$.

(j) Show from the results of (i) and (h) that $\psi_1(i\lambda) = i^\beta \psi_1(\lambda)$.

(k) By assumption, 1 is in the range of $f_1$, and so $\psi_1(1)$ is defined.
Show that $\psi_1(\lambda) = \psi_1(1)\lambda^\beta$. Conclude that $\phi_1(\lambda) = \psi_1(\lambda^{1/m}) = \psi_1(1)\lambda^{\beta/m}
= \gamma_1 \lambda^{\alpha_1}$, where $\alpha_1 = \beta/m$. A similar argument shows that $\phi_2(\lambda) =
\gamma_2 \lambda^{\alpha_2}$, $\alpha_2 = \beta/n$. Thus, (5.31) follows with $\alpha_1/\alpha_2$ rational, in fact equal
to n/m.

(Note: An alternative approach to the relation between conjoint and 
extensive measurement is based on distributive laws. See Luce [1978] and 
Narens and Luce [1976] for a development.) 
26. Discuss density as a fundamental scale. (In Chapter 2 we discuss it
as a derived scale.) In particular, if D is “denser than,” discuss axioms for
the existence of functions M and V such that


$aDb \iff M(a)V(b) > M(b)V(a)$.


27. In Section 2.6, Exer. 12, we mentioned an experiment designed to 
rank-order a collection of stereo speakers. Consider this experiment in the 
light of the conditions for conjoint measurement discussed in this section. 
For example, consider what is being assumed about the parameters in 
order to sum scores over dimensions. 


5.5 Nonadditive Representations 
5.5.1 Decomposability and Polynomial Conjoint Measurement 


The measurement representations (5.12) and (5.19) are based on two 
principles: We can decompose utilities (or other scales) into utilities on 
individual dimensions, and we can then add the results. The idea of 
decomposability has great applicability even in situations where additivity 
does not apply. In general, if A has a product structure, i.e., if A equals
$A_1 \times A_2 \times \cdots \times A_n$, and R is a binary relation on A, we say that (A, R)
is decomposable if there are real-valued functions $f_1, f_2, \ldots, f_n$ on
$A_1, A_2, \ldots, A_n$, respectively, and a function


$F: [f_1(A_1) \times f_2(A_2) \times \cdots \times f_n(A_n)] \to Re$

such that for all $a = (a_1, a_2, \ldots, a_n)$ and $b = (b_1, b_2, \ldots, b_n)$ in A,


$aRb \iff F[f_1(a_1), f_2(a_2), \ldots, f_n(a_n)] > F[f_1(b_1), f_2(b_2), \ldots, f_n(b_n)]$. (5.33)
The function F is called a composition rule. It is frequently assumed that F
is one-to-one or strictly increasing in each variable. The representation
(5.33) applies in very general situations. Of course, the additive representation
arises if F satisfies


$F(x_1, x_2, \ldots, x_n) = x_1 + x_2 + \cdots + x_n$. (5.34)


We have already seen (Section 5.4.1) that the representation (5.33) with F
given in (5.34) is closely related to the representation (5.33) with F given as
a product,


$F(x_1, x_2, \ldots, x_n) = x_1 x_2 \cdots x_n$. (5.35)


The product composition rule arises in the response strength situation with 
three factors. For example, Hull [1952] has argued that the appropriate 
representation here is

$(d_1, k_1, h_1) R (d_2, k_2, h_2) \iff \delta(d_1)\kappa(k_1)\mu(h_1) > \delta(d_2)\kappa(k_2)\mu(h_2)$. (5.36)


On the other hand, Spence [1956] has argued that the more appropriate
representation is


$(d_1, k_1, h_1) R (d_2, k_2, h_2) \iff [\delta(d_1) + \kappa(k_1)]\mu(h_1) > [\delta(d_2) + \kappa(k_2)]\mu(h_2)$. (5.37)

The composition rule in Spence’s model is the distributive rule


$F(x_1, x_2, x_3) = (x_1 + x_2)x_3$.


This distributive rule, in turn, also arises in the study of perceived risk 
(Coombs and Huang [1970]), where x, is a function of expected regret, x, 
of expected value, and x, of the number of plays in a gamble. See Krantz 
and Tversky [1971] and Krantz et al. [1971] for examples of other composi- 
tion rules. 

We first state some necessary and sufficient conditions for decomposa- 
bility with the function F being one-to-one in each variable and then 
investigate briefly some necessary conditions for various composition rules. 
By the Birkhoff-Milgram Theorem (Corollary 1 to Theorem 3.4), decomposability
implies that (A, R) is a strict weak order and $(A^*, R^*)$ has a
countable order-dense subset. Moreover, since F is one-to-one in each
variable,


$F(y_1, \ldots, y_{i-1}, y_i, y_{i+1}, \ldots, y_n) = F(y_1, \ldots, y_{i-1}, y_i', y_{i+1}, \ldots, y_n)$


implies that $y_i = y_i'$. Thus, decomposability with such an F implies that for
all i and all $a_i, a_i'$ in $A_i$ and all $b_j, c_j$ in $A_j$, $j \neq i$, we have


$(b_1, \ldots, b_{i-1}, a_i, b_{i+1}, \ldots, b_n) E (b_1, \ldots, b_{i-1}, a_i', b_{i+1}, \ldots, b_n)$
$\iff$ (5.38)
$(c_1, \ldots, c_{i-1}, a_i, c_{i+1}, \ldots, c_n) E (c_1, \ldots, c_{i-1}, a_i', c_{i+1}, \ldots, c_n)$,


where E is the indifference relation defined from R by Eq. (5.3). If
condition (5.38) holds, we say (A, R) satisfies substitutability. The following
theorem is proved in Krantz et al. [1971].
THEOREM 5.5. The product structure (A, R) is decomposable with a function
F which is one-to-one in each variable if and only if (A, R) is a strict
weak order, $(A^*, R^*)$ has a countable order-dense subset, and (A, R) satisfies
substitutability.


Proof. Necessity has already been shown. To show sufficiency, note that
by the Birkhoff-Milgram Theorem (Corollary 1 to Theorem 3.4), there is a
real-valued function f on A such that for a, b in A,


$aRb \iff f(a) > f(b)$.

Fixing $x_j$ in $A_j$, $j = 1, 2, \ldots, n$, we define $f_i$ on $A_i$ by


$f_i(a_i) = f(x_1, x_2, \ldots, x_{i-1}, a_i, x_{i+1}, \ldots, x_n)$.

Finally, let $F: [f_1(A_1) \times f_2(A_2) \times \cdots \times f_n(A_n)] \to Re$ be defined as


$F[f_1(a_1), f_2(a_2), \ldots, f_n(a_n)] = f(a_1, a_2, \ldots, a_n)$.


It is easy to verify that F is well-defined and is one-to-one in each variable.
∎


The composition rules that have been of most interest in the measure- 
ment literature are those where F is a polynomial, a function of the form 


$F(x_1, x_2, \ldots, x_n) = \sum_k \alpha_k x_1^{\beta_{1k}} x_2^{\beta_{2k}} \cdots x_n^{\beta_{nk}}$,


where each $\alpha_k$ is a real number and the $\beta_{ik}$ are nonnegative integers. In case F is
a polynomial, the representation (5.33) is sometimes called polynomial 
conjoint measurement. Krantz et al. [1971, Chapter 7] summarize some 
conditions either necessary or sufficient for a variety of polynomial repre- 
sentations. However, there is still much work to be done in this area. It is 
sometimes of sufficient practical use to specify necessary conditions for a 
polynomial representation. For example, in the response strength applica- 
tion, if one has a choice between the Hullian and Spencian models, one 
has a choice between the two polynomial composition rules (5.36) and 
(5.37). As Krantz and Tversky [1971] point out, if one can derive necessary
conditions for each and show that one of these necessary conditions is
violated, this eliminates one of the possible models. For example, assuming
that $\delta$, $\kappa$, and $\mu$ all take on only positive values, one necessary condition for
the representation (5.36) is the following independence condition (known
as a joint independence condition):


$(d_1, k_1, h_1) R (d_2, k_1, h_2) \Rightarrow (d_1, k_2, h_1) R (d_2, k_2, h_2)$. (5.39)


However, this condition is not necessary for the representation (5.37). For, 
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as Krantz and Tversky [1971] point out, suppose we take 6(d,) = 1, 
8(d,) = 2, «(k,) = 4, K(k) = 2, w(h,) = 5, and p(A,) = 4. Then if R is 
defined by (5.37), we have a violation of (5.39): 


(4), ky, hy) Rd, k,, h,) since (1+ 4)5 > (2+ 44, 
~ (dj, kay hy)R(dy, ky, hy) since (1 + 298 (2 + 24. 


For a discussion of similar conditions useful in the testing of alternative
polynomial conjoint measurement models, see Exer. 5 below and see
Krantz [1968], Krantz and Tversky [1971], or Krantz et al. [1971,
Chapter 7].
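The counterexample can be checked numerically. The following is a minimal sketch, assuming that the Hullian rule (5.36) is the product of the three scale values and that the Spencian rule (5.37) is (drive + incentive) times habit, as the arithmetic quoted in the text indicates. The names `hull`, `spence`, and `violations_of_539` are illustrative only.

```python
from itertools import product

# Scale values from the Krantz-Tversky counterexample quoted in the text.
delta = {"d1": 1, "d2": 2}
kappa = {"k1": 4, "k2": 2}
eta = {"h1": 5, "h2": 4}

def hull(d, k, h):
    """Multiplicative rule: assumed form of (5.36)."""
    return delta[d] * kappa[k] * eta[h]

def spence(d, k, h):
    """Distributive rule: assumed form of (5.37)."""
    return (delta[d] + kappa[k]) * eta[h]

def violations_of_539(rule):
    """List every instance where (5.39) fails: (d, k, h) beats (d2, k, h2),
    but the comparison is lost when the common k is replaced by k2."""
    bad = []
    for d, d2, h, h2, k, k2 in product(delta, delta, eta, eta, kappa, kappa):
        premise = rule(d, k, h) > rule(d2, k, h2)
        conclusion = rule(d, k2, h) > rule(d2, k2, h2)
        if premise and not conclusion:
            bad.append(((d, k, h), (d2, k, h2), k2))
    return bad

print(violations_of_539(hull))              # []
print(len(violations_of_539(spence)) > 0)   # True
```

On this data the product rule never violates (5.39), while the distributive rule does, and the instance given in the text is among the violations found.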


5.5.2 Nondecomposable Representations 


Even if decomposability is not satisfied, one can usefully study numerical
representations on product structures. For example, in case R is
preference, Raiffa [1969] and Fishburn [1974] seek real-valued functions $f_1$
and $\lambda_1$ on $A_1$ and $f_2$ and $\lambda_2$ on $A_2$ such that for all $a_1, b_1$ in $A_1$ and $a_2, b_2$ in
$A_2$,

$$(a_1, a_2)R(b_1, b_2) \iff f_1(a_1) + f_2(a_2) + \lambda_1(a_1)\lambda_2(a_2) > f_1(b_1) + f_2(b_2) + \lambda_1(b_1)\lambda_2(b_2). \tag{5.40}$$

The term $\lambda_1(a_1)\lambda_2(a_2)$ represents an interaction effect. The representation
(5.40) is called the quasi-additive representation, and we shall encounter a
variant of it in Section 7.3.3, where we shall give some sufficient conditions
for the existence of a quasi-additive utility function and mention a variety
of applications. See also Farquhar [1977] for a discussion.

Tversky [1969] studies the representation where there are real-valued
functions $f_1, f_2, \ldots, f_n$ on $A_1, A_2, \ldots, A_n$, respectively, and increasing,
continuous functions $\phi_1, \phi_2, \ldots, \phi_n$ from Re to Re, so that for all i,

$$\phi_i(-\delta) = -\phi_i(\delta),$$

and so that

$$aRb \iff \sum_{i=1}^{n} \phi_i[f_i(a_i) - f_i(b_i)] > 0. \tag{5.41}$$

The representation (5.41) is called the additive difference model. See Wallsten
[1976] for a recent application and Beals, Krantz, and Tversky [1968]
for an analogous representation if indifference is used in place of preference.
If each $\phi_i$ is a linear function, $\phi_i(\delta_i) = t_i\delta_i$ for some positive $t_i$, then
the additive difference model implies additive conjoint measurement. For

$$\sum_{i=1}^{n} \phi_i[f_i(a_i) - f_i(b_i)] = \sum_{i=1}^{n} [t_i f_i(a_i) - t_i f_i(b_i)],$$

and we have

$$aRb \iff \sum_{i=1}^{n} t_i f_i(a_i) > \sum_{i=1}^{n} t_i f_i(b_i).$$


The representation (5.41) can be generalized to the representation 


$$aRb \iff \phi\{[f_1(a_1) - f_1(b_1)], [f_2(a_2) - f_2(b_2)], \ldots, [f_n(a_n) - f_n(b_n)]\} > 0,$$


or the representation 
$$aRb \iff F\Big[\sum_{i=1}^{n} \phi_i(a_i, b_i)\Big] > 0.$$


Such representations are called by Krantz [1968] absolute difference models. 
They have been studied rather extensively by Pfanzagl [1959]. 

Another interesting class of representations of current interest is the 
general class known as fractional hypercube representations, which are 
discussed by Farquhar [1974, 1975, 1976]. Farquhar develops techniques 
for deriving sufficient conditions for a wide variety of these representa- 
tions. Farquhar [1977] surveys a number of other representations of 
current interest. 


Exercises 
1. Suppose there is a function f: A → Re such that for all a, b ∈ A,

$$aRb \iff f(a) > f(b).$$

(a) Show that if $f(a_1, a_2) = a_1 + a_2 + a_1 a_2$, then (A, R) is decomposable.
(b) For each of the following functions f, determine if (A, R) is
decomposable:
(i) $f(a_1, a_2) = a_1^2 + a_1 a_2$.
(ii) $f(a_1, a_2) = \max\{a_1, a_2\}$.
(iii) $f(a_1, a_2) = a_1 - a_2$.
(iv) $f(a_1, a_2) = 1/a_1$.
(v) $f(a_1, a_2) = a_1/(a_1 + a_2)$.
2. Observe that the function THI(?, 4) of Exer. 5, Section 5.4, defines a 
quasi-additive representation that is also a polynomial representation. 


3. Show that in both the Hullian and Spencian models, if the functions
δ, κ, and η all take on only positive values, then the following independence
condition is satisfied:

$$(d_1, k_1, h_1)R(d_2, k_1, h_1) \Rightarrow (d_1, k_2, h_2)R(d_2, k_2, h_2).$$


4. Show that F as defined in the proof of Theorem 5.5 is well-defined 
and is one-to-one in each variable. 
5. (Krantz and Tversky [1971]) (a) The simple polynomials are defined as
follows: Any monomial $x_i$ is a simple polynomial. If two simple polynomials
have no variables in common, their sum and product are simple
polynomials. Show that, up to permutation of labels, there are only four
simple polynomials $F(x_1, x_2, x_3)$ of three variables:

$x_1 + x_2 + x_3$ (additive),
$(x_1 + x_2)x_3$ (distributive),
$x_1 x_2 + x_3$ (dual distributive),
$x_1 x_2 x_3$ (multiplicative).


(b) In the following, we shall assume that $A = A_1 \times A_2 \times A_3$ and
that (5.33) holds with all $f_i(x_i) > 0$. We shall investigate various necessary
conditions which can be used to differentiate among the simple polynomial
composition rules of part (a). For a more complete discussion, see Krantz
[1968], Krantz and Tversky [1971], or Krantz et al. [1971, Chapter 7]. We
say that $A_1$ is independent (of $A_2$ and $A_3$) if, for all $a, b \in A_1$, $p, q \in A_2$,
and $u, v \in A_3$,

$$(a, p, u)R(b, p, u) \iff (a, q, v)R(b, q, v).$$

(The rule in Exer. 3 exemplifies this.) Independence of $A_2$ and of $A_3$ is
defined similarly. These notions are generalizations of the independence
notions that arose in Section 5.4.3. Show that for any of the F's of part (a),
independence holds for each $A_i$.

(c) Again generalizing a notion of Section 5.4.3, we say that double
cancellation holds for $A_1$ and $A_2$ if

$$[(a, q, u)S(b, r, u) \ \& \ (b, p, u)S(c, q, u)] \Rightarrow (a, p, u)S(c, r, u),$$

where S is defined from R by Eq. (5.2). A similar definition holds for any
$A_i$ and $A_j$. Show that if F is any of the polynomials of part (a), then double
cancellation holds for any $A_i$ and $A_j$.

(d) We say that $A_1$ and $A_2$ satisfy joint independence (from $A_3$) if

$$(a, p, u)S(b, q, u) \iff (a, p, v)S(b, q, v).$$

Similar definitions hold for joint independence of $A_i$ and $A_j$ (from $A_k$).
Show that joint independence for all pairs i and j implies independence for
all single factors.

(e) Show that if F is any of the polynomials of part (a), then some 
pair of factors is jointly independent (from the third). 

(f) Show that if F is the additive polynomial of part (a), then joint 
independence holds for each pair. 

(g) We say that distributive cancellation holds if, whenever the conditions
(a, p, u)S(c, r, v), (b, q, u)S(d, s, v), and (d, r, v)S(b, p, u) all hold,
then (a, q, u)S(c, s, v). Show that if F is the distributive polynomial of part
(a), then distributive cancellation holds.

(h) Show that distributive cancellation also follows if F is the additive
polynomial of part (a).

(i) However, show that distributive cancellation may fail if F is the
dual distributive polynomial of part (a).

6. Show that if the quasi-additive representation holds, then (A, R) is a 
strict weak order. 

7. Show that if the additive difference model holds, then (A, R) does 
not have to be a strict weak order; in particular, R does not have to be 
transitive. Thus, the additive difference model is a measurement model for 
preferences which may not be transitive. (Tversky [1969] shows that if
n ≥ 3, and the additive difference model holds, then (A, R) is transitive if
and only if all the functions $\phi_i$ are linear, that is, there are real numbers $t_i$
so that $\phi_i(\delta) = t_i\delta$, all δ.)

8. Huang [1975] and Kirk [1977] consider the following representation, 
which they call nonsimple distributive conjoint measurement: 


$$aRb \iff \sum_{i=1}^{n-1} g_i(a_i) f_{i+1}(a_{i+1}) > \sum_{i=1}^{n-1} g_i(b_i) f_{i+1}(b_{i+1}).$$


Which of the axioms for additive conjoint measurement are necessary for 
this representation? 


5.6 Joint Scales of Individuals and Alternatives 


In this section, we turn to a quite different kind of product structure, 
that where the set of individuals making judgments is one dimension and 
the set of alternatives or objects about which judgments are made is a 
second dimension. This kind of situation arose in the mental testing 
situation, and we shall see a variety of other situations in which it arises. 
We present this material mostly to illustrate problems of measurement 
quite different from the preservation of ordinal preference data. 


5.6.1 Guttman Scales 


Suppose S is a set of individuals whose reactions or experiences are 
being studied and E is a set of reactions or experiences. Let aRb mean that 
individual a had (or experienced) reaction or experience b. Then R defines 
a binary relation on S ∪ E; in particular, R is a subset of S × E. In a
classical experiment, Stouffer et al. [1950] studied fear symptoms of United 
States soldiers during World War II. The set S here is the group of soldiers 
being studied, and the set E consists of certain fear reactions such as 
violent pounding of the heart, shaking or trembling all over, or losing 
control of the bowels. The experimenters found that it was possible to 
order the reactions such that if a soldier experienced a reaction, he (tended 
to) experience all reactions coming before it in the order. Thus, it was 
possible to simultaneously order the individuals and reactions in such a 
way that individual a had reaction b if and only if a followed b in the
ordering. This joint ordering of individuals and reactions suggests that there
is a natural ordering of the fear reactions, from least to most severe. In 
terms of a representation, we can think of finding two real-valued func- 
tions, s on S and e on E, such that for all a in S and b in E, 


$$aRb \iff s(a) > e(b). \tag{5.42}$$


The two functions s and e satisfying Eq. (5.42) define what is called a 
Guttman scale, after Louis Guttman [1944]. Notice that we are not seeking 
a homomorphism from one relational system into another. However, 
obtaining functions s and e satisfying Eq. (5.42) can be thought of as a 
measurement problem, and we can ask for a representation theorem. 

In general, given a triple (S, E, R), with R a subset of S × E, we ask
whether (S, E, R) possesses a Guttman scale. This question arises in a 
variety of contexts other than the one we have discussed. For example, if S 
is a set of individuals and E is a set of statements, the relation aRb can be 
taken to mean that individual a agrees with statement b. Then the 
existence of a Guttman scale implies that the statements have a certain 
natural ordering vis-a-vis the subjects. If S is a set of individuals and E a 
set of test items, and aRb means that individual a answers test item b
correctly, then a Guttman scale leads to a natural joint ordering of items as 
to difficulty vis-a-vis the level of skill of individuals, with a subject 
answering an item correctly if and only if his skill level is above the level of 
difficulty of the item. 

It is clear that for a Guttman scale to exist, it is not possible to have 


$$aRb \text{ and } \sim a'Rb \text{ while } a'Rb' \text{ and } \sim aRb'. \tag{5.43}$$

If condition (5.43) fails for all a, a' ∈ S and b, b' ∈ E, we say that
(S, E, R) is consistent.


THEOREM 5.6. If S and E are finite sets and R ⊆ S × E, then (S, E, R)
possesses a Guttman scale if and only if (S, E, R) is consistent.


Proof. Omitted. See, for example, Ducamp and Falmagne [1969]. 

We shall generalize the notion of Guttman scale in Exers. 32 and 33, 
Section 6.1. For recent results on existence of Guttman scales, see Leibo- 
witz [1978]. 
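For finite S and E, consistency says that the reaction sets of any two individuals are nested, and this suggests one way to build a scale. The following is a minimal sketch of such a construction, not the proof of Ducamp and Falmagne; the function names and the sample data are hypothetical. It scores each individual by the number of reactions he has and places each reaction just below the lowest-scoring individual who has it.

```python
def reaction_sets(S, E, R):
    """Map each individual a to the set of b in E with aRb."""
    return {a: frozenset(b for b in E if (a, b) in R) for a in S}

def is_consistent(S, E, R):
    """Condition (5.43) never occurs exactly when the reaction sets of
    any two individuals are comparable under inclusion."""
    rows = reaction_sets(S, E, R)
    return all(rows[a] <= rows[a2] or rows[a2] <= rows[a]
               for a in S for a2 in S)

def guttman_scale(S, E, R):
    """Return (s, e) with aRb <=> s[a] > e[b], or None if inconsistent."""
    if not is_consistent(S, E, R):
        return None
    rows = reaction_sets(S, E, R)
    s = {a: len(rows[a]) for a in S}
    e = {}
    for b in E:
        scores = [s[a] for a in S if (a, b) in R]
        # A reaction nobody has is placed above every individual.
        e[b] = min(scores) - 0.5 if scores else len(E) + 1
    return s, e

# Hypothetical data: three individuals, three reactions.
S = ["a1", "a2", "a3"]
E = ["b1", "b2", "b3"]
R = {("a1", "b1"), ("a2", "b1"), ("a2", "b2"),
     ("a3", "b1"), ("a3", "b2"), ("a3", "b3")}
s, e = guttman_scale(S, E, R)
print(s)  # {'a1': 1, 'a2': 2, 'a3': 3}
print(e)  # {'b1': 0.5, 'b2': 1.5, 'b3': 2.5}
```

The construction works because, under consistency, an individual who lacks reaction b has a reaction set strictly contained in that of anyone who has b, and hence a strictly smaller score.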

5.6.2 Unfolding 
Let us again consider the situation where S is a set of subjects and E is a
set of experiences, reactions, statements, or alternatives; to be concrete, let
us think of statements. We imagine that the statements can be
measured (for example, on a scale of degree of conservatism); let e(b) be
the measure of statement b. We imagine that each individual a associates 
an ideal value or degree on the scale (for example ideal degree of 
conservatism)—let s(a) be this ideal degree. Then we can certainly imagine 
the individual a as preferring statement x to statement y, or agreeing more 
with statement x than with statement y, if and only if statement x is closer 
to a’s ideal than is statement y, that is, if and only if 


$$|s(a) - e(x)| < |s(a) - e(y)|.$$


The functions s and e are sometimes said to define a joint scale of
individuals and statements. Suppose we let $xR_a y$ mean that individual a
agrees more with statement x than with statement y. Then we have for all a
in S and x, y in E,

$$xR_a y \iff |s(a) - e(x)| < |s(a) - e(y)|. \tag{5.44}$$

For each a, it follows from (5.44) that the binary relation $R_a$ defines a strict
weak order: an ordinal utility representation can be obtained by “folding”
the real line at s(a).
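The folding idea can be made concrete with a small sketch. Assuming numerical positions e(x) and an ideal point s(a), the negative distance to the ideal serves as an ordinal utility for individual a. The statement labels, positions, and ideal point below are hypothetical.

```python
def folded_utility(ideal, e):
    """Utility of each statement for an individual with the given ideal
    point: the negative distance to the ideal, so closer ranks higher."""
    return {x: -abs(ideal - position) for x, position in e.items()}

def prefers(ideal, e, x, y):
    """x R_a y in the sense of (5.44): x is strictly closer to the ideal."""
    return abs(ideal - e[x]) < abs(ideal - e[y])

e = {"x": 1.0, "y": 4.0, "z": 6.0}   # hypothetical statement positions
ideal = 4.5                          # hypothetical ideal point s(a)
u = folded_utility(ideal, e)
ranking = sorted(e, key=lambda x: u[x], reverse=True)
print(ranking)  # ['y', 'z', 'x']
```

Here z lies to the right of the ideal and x to the left, but folding the line at 4.5 places z (distance 1.5) ahead of x (distance 3.5).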

Conversely, suppose each individual a gives us his preferences among
statements, with $xR_a y$ having the interpretation that a agrees more with x
than with y. We ask: Are there real-valued functions s on S and e on E
satisfying Eq. (5.44)? If so, we say that s and e define a Coombs scale, after
Clyde Coombs [1950]. As we observed above, in order for a Coombs scale
to exist, the individual preference relations $R_a$ must all be strict weak
orders. The Coombs scale can be thought of as a joint “unfolding” of the
individual preference orderings.

The representation (5.44) can be generalized in a natural way. Namely,
this representation assumes that we can measure distance |s(a) - e(x)|, but
uses a very specific distance measure. Other metrics can be used in place of
this. In the most general situation, we can think of a metric space (X, d)
and functions s: S → X and e: E → X such that for all a in S and x, y in E,

$$xR_a y \iff d[s(a), e(x)] < d[s(a), e(y)]. \tag{5.45}$$


In particular, if (X, d) is a higher-dimensional Euclidean space, we speak 
of multidimensional unfolding. Multidimensional unfolding has been studied 
by Bennett and Hays [1960] and Suppes and Zinnes [1963]. See also 
Coombs [1964, Chapter 7]. 

Not much progress has been made on an axiom system for the repre- 
sentation (5.44), let alone its generalizations. We shall present some results 
for (5.44) in the exercises. 

It should be noted that neither Guttman scales nor Coombs scales are
usually obtainable if the representation is asked to hold “exactly.” The
representations can usually be obtained only “approximately” at best, and
the emphasis in the literature is often to assume that the representation
holds and to find the “best-fitting” functions s and e.


Exercises 


1. A triple (S, E, R) with R ⊆ S × E can be represented by a matrix of
0's and 1's whose rows correspond to elements of S and whose columns
correspond to elements of E, with the i, j entry equal to 1 if and only if iRj.
What property of this matrix after a permutation of rows and columns
corresponds to existence of a Guttman scale?

2. Suppose S = {1, 2, 3} and E = {α, β, γ}.

(a) Show that if R = {(1, α), (2, β), (3, γ)}, there is no Guttman
scale.
(b) Check if there is a Guttman scale in the following cases:
(i) R = {(1, α), (2, α), (3, α)}.
(ii) R = {(1, α), (1, β), (2, α), (3, α), (3, β)}.

3. Suppose that S = {1, 2, 3}, E = {α, β, γ}, and $R_1$, $R_2$, $R_3$ are the
following rankings (strict simple orders):
$R_1$: α over β over γ,
$R_2$: β over γ over α,
$R_3$: γ over α over β.

Show that there is no Coombs scale.

4. If S and E are as in Exer. 3, determine whether there are Coombs
scales in the following cases:
(a) $R_1$: α over β over γ,
$R_2$: β over γ over α,
$R_3$: β over γ over α.
(b) $R_1$: α over β over γ,
$R_2$: γ over β over α,
$R_3$: α over γ over β.

5. (Ducamp and Falmagne [1969], Ore [1962]) Imagine a set S of
patients and a set E of symptoms. Let aRb hold if and only if patient a has
symptom b. The problem is to assign to each patient and to each symptom
a disease such that a patient has all the symptoms of his disease, and only
these symptoms. Put another way, find functions s: S → Re and e: E → Re
such that for all a ∈ S and b ∈ E,

$$aRb \iff s(a) = e(b). \tag{5.46}$$

Show that if S and E are finite and R ⊆ S × E, then there are functions s
and e satisfying Eq. (5.46) if and only if for all a, a' ∈ S and b, b' ∈ E,

$$(aRb \ \& \ a'Rb \ \& \ a'Rb') \Rightarrow aRb'.$$


6. Suppose R is a binary relation on A × A. Krantz et al. [1971, Section
4.1.2] give sufficient conditions for the existence of a function f: A → Re
such that for all a, x, y ∈ A,

$$(a, x)R(a, y) \iff |f(a) - f(x)| > |f(a) - f(y)|. \tag{5.47}$$

This representation is related to the representation (5.44) by taking S = E
= A and defining $R_a$ on E = A by

$$xR_a y \iff (a, y)R(a, x).$$

Krantz et al. [1971, Section 4.10] also study the following related
representation:

$$(a, b)R(c, d) \iff |f(a) - f(b)| > |f(c) - f(d)|. \tag{5.48}$$

The relation R can be derived on the basis of judged proximity, or on the
basis of the response “a and b are further apart than c and d.” The first
representation theorem for the representation (5.48) was given by Hölder
[1901]. More modern axiomatizations were first given by Suppes and
Winet [1955] and by Tversky and Krantz [1970]. If (X, d) is a metric space,
and f is a function from A into X, then the following representation
generalizes (5.48):

$$(a, b)R(c, d) \iff d[f(a), f(b)] > d[f(c), f(d)]. \tag{5.49}$$

The representation (5.49), when used with higher dimensional Euclidean
metrics, is at the foundation of multidimensional scaling in psychology. It
has been studied theoretically by Beals, Krantz, and Tversky [1968].
Beginning with the work of Shepard [1962a, b] and Kruskal [1964a, b], a
large number of computer programs have been developed to fit data to
representations of the form (5.49).

Exercises 6 through 8 study the representation (5.47).

(a) Show that the following condition of negative transitivity is
necessary for the representation (5.47):

$$[\sim (b, d)R(a, d) \ \& \ \sim (c, d)R(b, d)] \Rightarrow \ \sim (c, d)R(a, d).$$

(b) Consider the necessity of the following conditions as well:
(i) $[\sim (b, c)R(a, c) \ \& \ \sim (a, b)R(c, b)] \Rightarrow (b, a)R(c, a)$.
(ii) $[\sim (b, d)R(a, d) \ \& \ \sim (c, b)R(d, b) \ \& \ \sim (a, c)R(b, c)] \Rightarrow \ \sim (c, a)R(d, a)$.
7. Given (A × A, R), we define a ternary relation of betweenness B on
A as follows:

$$B(a, b, c) \iff [\sim (b, c)R(a, c) \ \& \ \sim (b, a)R(c, a)].$$

Which of the following conditions on the betweenness relation, assumed
by Krantz et al., are necessary for the representation (5.47)?

(a) If b ≠ c, then B(a, b, c) and B(b, c, d) imply B(a, b, c),
B(a, c, d), B(a, b, d), and B(b, c, a).
(b) If B(a, b, c) and B(a, c, d), then B(a, b, d) and B(b, c, d).


8. If R is a binary relation on A × A and a, b, c ∈ A, a ≠ b, we say c is
a midpoint of a and b if ~(c, a)R(c, b) and ~(c, b)R(c, a). Show that it is
possible for there to be a function f satisfying Eq. (5.47), but for there to be
some a ≠ b in A such that there is no midpoint c of a and b.
CHAPTER 6


Nontransitive Indifference, 
Probabilistic Consistency, and 
Measurement without Numbers 


6.1 Semiorders and Interval Orders 
6.1.1 Nontransitivity of Indifference 


In Section 3.1, we gave examples where R is preference and there is no 
homomorphism from (A, R) into (Re, >). In this chapter, we give some 
additional examples and then we ask whether or not measurement is still 
possible if such a homomorphism does not exist. We are led in the process 
to consider several unorthodox examples of measurement, including 
measurement without numbers, and measurement that starts with probabil- 
ities or proportions instead of relations. The results have applications 
beyond preference, in particular to measurement of psychophysical quanti- 
ties such as loudness, which we studied in Chapter 4. 

If R is (strict) preference on a set A, then indifference corresponds to the
binary relation I on A defined by

$$aIb \iff \ \sim aRb \ \& \ \sim bRa. \tag{6.1}$$

That is, you are indifferent between a and b if and only if you prefer
neither a to b nor b to a.* Suppose there is an ordinal utility function f on
A, that is, a function f: A → Re satisfying

$$aRb \iff f(a) > f(b). \tag{6.2}$$

If f exists, then

$$aIb \iff f(a) = f(b). \tag{6.3}$$

Equation (6.3) implies that I is transitive, for aIb and bIc imply f(a) = f(b)
and f(b) = f(c), whence f(a) = f(c), so aIc.

*In previous chapters, we used the notation E for this relation. Here we use I, for reasons
to be explained below.

The economist Armstrong [1939, 1948, 1950, 1951] was one of the first to 
argue that indifference is not necessarily transitive.† (Menger [1951] claims
that attacks on the transitivity of indifference go back as far as Poincaré in 
the nineteenth century.) Luce [1956] suggests as one argument against the 
transitivity of indifference the following. Most people would prefer a cup 
of coffee with one spoon of sugar to a cup with five spoons. But if sugar 
were added to the first cup at the rate of 1/100 of a gram, they would 
almost certainly be indifferent between successive cups. If indifference 
were transitive, they would have to be indifferent between the cup with one 
spoon and the cup with five spoons. Similarly if preference between air 
environments is determined on the basis of eye irritation, then you prob- 
ably prefer an air environment with .05 parts per million (ppm) of ozone to 
one with .5 ppm. But you remain indifferent if ozone is added in amounts
of $10^{-10}$ ppm at a time. To give a related example, in the judgments that
one sound is louder than another, we might easily find three sounds a, b, 
and c such that a and b as well as b and c are judged equally loud, because 
they are sufficiently close, while a and c are sufficiently far apart so we can 
recognize one as louder. Thus, the attack on transitivity of the relation I
extends beyond the case of preference. 

A considerably different example is the following. Suppose you are 
indifferent between two alternative plans for government support of the 
arts, plans a and b, where plan a would allocate a budget of 200 million 
dollars to a federal Institute for the Arts and plan b would allocate 200
million dollars to various state institutes. It seems likely that you would 
still be indifferent between plan a and plan b’, which would allocate 200 
million and one dollars to the state institutes. For probably if you have any 
preference when budgets are so close it will be based on a choice of a 
particular approach to support of the arts (federal versus state). On the 
other hand, if you want to see the government spend money on the arts, 
you would certainly prefer b’ to b, which violates transitivity of indif- 
ference. 

A famous related example in utility theory, due to Armstrong [1939], is
the following. Suppose a boy is indifferent between receiving as a gift a
pony or a bicycle. He will undoubtedly prefer the bicycle with a bell
added to the bicycle without the bell. But he is still likely to be indifferent
between the bicycle with bell and the pony. Hence, indifference is not
transitive.

†Hence, indifference is not an equivalence relation, which is one reason why we choose to
use I rather than E for indifference in this chapter.

Other arguments against the transitivity of indifference and many refer- 
ences to the literature of this issue can be found in Fishburn [1970a] and in 
Krantz et al. [to appear].* 

The problem of nontransitivity of indifference led Luce [1956] to slightly
modify the demands in the measurement of preference. Motivated by
examples like the first two, and the notion of threshold in psychophysics,
he suggested that we seek a real-valued function f on A so that for all a,
b ∈ A, a is preferred to b if and only if f(a) is not only larger than f(b) but
“sufficiently larger” so that we can tell a and b apart. To formalize this
representation problem, we fix a positive number δ, the threshold, and ask
for conditions on the relational system (A, R) necessary and sufficient for
the existence of a real-valued function f on A such that for all a, b ∈ A,

$$aRb \iff f(a) > f(b) + \delta. \tag{6.4}$$

This representation is obviously of interest for judgments of relative
loudness as well as for judgments of preference. To formulate this representation
in the measurement-theoretic terms of Section 2.1, we define a
binary relation $>_\delta$ on Re by

$$x >_\delta y \iff x > y + \delta.$$

Then we ask for a homomorphism from (A, R) into (Re, $>_\delta$).†

We shall restrict our discussion to the case where A is a finite set, in 
which case conditions on (A, R) necessary and sufficient for the repre- 
sentation (6.4) can be explicitly stated. We should note that this repre- 
sentation, which is designed to account for examples like the cups of 
coffee, the comparison of air environments, and the comparison of loud- 
ness, does not account for examples like the alternative budgets and the 
pony-bicycle. For example, if a’ is the plan of budgeting 200 million and 
one dollars to the federal government, you probably prefer a’ to a, but are 


*Kramer [1968] argues that the nontransitivity of indifference may be due to the 
organism’s limited capacity. Computer scientists have made analogous statements, to the 
effect that the relation of equality between numbers is nontransitive for a computer due to 
round off error, resulting from limited memory, speed, or available time. (See Hamming 
[1965] and Rothstein [1965].) These points were made by Professor Jacob Marschak in a 
Western Management Science Colloquium at U.C.L.A. in 1971. The limited capacity argu- 
ment might explain the preferences for cups of coffee and for air environments, but it is not 
clear that it explains the alternative budgets or the pony—bicycle example. 

†Other representations for preference using the notion of threshold are considered in
Fishburn [1970a, b], Krantz et al. [to appear], Luce [1956], and Roberts [1969a, 1971b]. In 
these treatments, the threshold is often allowed to vary from place to place rather than 
remaining constant as it does here. 
indifferent between a' and b and between a' and b'. Thus, your preference
relation R on the set of plans {a, a', b, b'} is probably the relation
{(a', a), (b', b)}. But this relation cannot be represented in the form (6.4).
For if it could, then, since b' is not preferred to a, we would have

$$f(a') > f(a) + \delta \geq f(b') > f(b) + \delta,$$

whence a'Rb, which is not the case. We return to the alternative budgets
example in Section 6.2.3, where we discuss the condition of strong
stochastic transitivity.


6.1.2 The Scott-Suppes Theorem 


Conditions on (A, R) necessary and sufficient for the representation 
(6.4) are embodied in the concept of a semiorder, a concept introduced by 
Luce [1956]. Our definition of semiorder is formulated following that of 
Scott and Suppes [1958]. The binary relation (A, R) is called a semiorder if, 
for all a, b, c, d € A, the following axioms are satisfied: 


AXIOM S1. ~aRa.
AXIOM S2. aRb & cRd ⇒ [aRd or cRb].
AXIOM S3. aRb & bRc ⇒ [aRd or dRc].


To explain the axioms, and to see that they follow from the representation
(6.4), let us first note that Axiom S1 says that (A, R) is irreflexive,
which follows since f(a) can never be larger than f(a) + δ. To see that
Axiom S3 holds, suppose aRb and bRc. Then f(a), f(b), and f(c) have
positions like those in Fig. 6.1. Now f(d) ≥ f(b) implies dRc, and f(d)
≤ f(b) implies aRd. To see that Axiom S2 holds, consider two cases:
f(a) ≥ f(c) and f(c) ≥ f(a). In the first case, we have

$$f(a) \geq f(c) > f(d) + \delta,$$

so aRd. In the second case, we have

$$f(c) \geq f(a) > f(b) + \delta,$$

so cRb.


[Figure 6.1: the points f(c), f(b), f(a) on the real line, in increasing order, with each consecutive gap greater than δ.]

Figure 6.1. Axiom S3 of the definition of a semiorder is a necessary condition.
In the budgetary example where A = {a, a', b, b'} and R =
{(a', a), (b', b)}, we see clearly that Axiom S2 is violated (a'Ra and b'Rb
hold, but neither a'Rb nor b'Ra does), so that the
preference relation is not a semiorder.
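For a finite set, the three axioms can be checked mechanically. The following is a minimal brute-force sketch; the function name `semiorder_violations` and the string labels for the plans are illustrative, not from the text.

```python
from itertools import product

def semiorder_violations(A, R):
    """Return the set of axioms among S1, S2, S3 that (A, R) violates."""
    R = set(R)
    bad = set()
    # S1: no element is related to itself.
    if any((a, a) in R for a in A):
        bad.add("S1")
    for a, b, c, d in product(A, repeat=4):
        # S2: aRb and cRd must give aRd or cRb.
        if (a, b) in R and (c, d) in R and (a, d) not in R and (c, b) not in R:
            bad.add("S2")
        # S3: aRb and bRc must give aRd or dRc, for every d.
        if (a, b) in R and (b, c) in R and (a, d) not in R and (d, c) not in R:
            bad.add("S3")
    return bad

plans = ["a", "a'", "b", "b'"]
budget_R = {("a'", "a"), ("b'", "b")}
print(semiorder_violations(plans, budget_R))  # {'S2'}
```

The check runs over all quadruples, so it is practical only for small sets; an empty result means that (A, R) is a semiorder.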

It should be remarked in passing that every semiorder is a strict partial
order (Section 1.6). To prove this, we note that transitivity follows by
taking d = c in Axiom S3: if aRb and bRc, then aRc or cRc, and cRc is
ruled out by Axiom S1.

We shall prove the following theorem, due to Scott and Suppes [1958]. 


THEOREM 6.1 (Scott and Suppes). Suppose R is a binary relation on a
finite set A and δ is a positive number. Then (A, R) is a semiorder if and only
if there is a real-valued function f on A such that for all a, b ∈ A,

$$aRb \iff f(a) > f(b) + \delta. \tag{6.4}$$

COROLLARY. If a binary relation R on a finite set A is representable in the
form (6.4) for some positive number δ, then it is representable in the form
(6.4) for any positive number δ'. In particular, it is representable in the form

$$aRb \iff f(a) > f(b) + 1. \tag{6.5}$$

Proof of Corollary. If f satisfies (6.4), then (δ'/δ)f satisfies (6.4) with δ'
in place of δ.
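The rescaling step can be spelled out as follows, writing g for the rescaled function (a name introduced here only for the derivation) and using the fact that δ'/δ is positive:

```latex
\[
aRb \iff f(a) > f(b) + \delta
    \iff \frac{\delta'}{\delta} f(a) > \frac{\delta'}{\delta} f(b) + \delta'
    \iff g(a) > g(b) + \delta',
\qquad \text{where } g = \frac{\delta'}{\delta}\, f .
\]
```

Taking δ' = 1 gives the form (6.5).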


We defer a proof of Theorem 6.1 until Section 6.1.7. Note that finiteness 
is not needed for the necessity of the semiorder axioms, only for their 
sufficiency. 

Before closing this section, we give an example. Let 


$$A = \{w, x, y, z, \alpha, \beta, \gamma\}$$

and let

$$R = \{(w, x), (w, y), (w, z), (w, \alpha), (w, \beta), (w, \gamma), (x, \alpha), (x, \beta), (x, \gamma), (y, \beta), (y, \gamma), (z, \gamma), (\alpha, \gamma)\}. \tag{6.6}$$

Then (A, R) is not a strict weak order, for ~xRz and ~zRα, but xRα,
which violates negative transitivity. Thus, there is no homomorphism from
(A, R) into (Re, >). But there is a function f on A satisfying Eq. (6.4), for
(A, R) is a semiorder. Axiom S1 for semiorders is straightforward. Axioms
S2 and S3 are tedious to check by hand. For example, we note that xRα
and zRγ. Thus, Axiom S2 requires that xRγ or zRα holds. The former is
the case. Similar checks must be made for many cases to verify Axiom S2.
Similarly, we note that xRα and αRγ. Using d = z, we note that, by Axiom
S3, either xRz or zRγ must hold. We have zRγ. Similar checks must be
made case by case to verify Axiom S3.

An easier way to check that (A, R) is a semiorder is to find a function f
satisfying Eq. (6.4). If δ = 1, such a function is given for our example by

$$f(w) = 5, \quad f(x) = 3, \quad f(y) = 2.7, \quad f(z) = 2.5, \quad f(\alpha) = 1.9, \quad f(\beta) = 1.6, \quad f(\gamma) = .8.$$


It is left to the reader to check that f satisfies Eq. (6.4). 
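The check can also be done by machine. The following is a minimal sketch, assuming the Greek-letter elements of A are spelled out as `alpha`, `beta`, `gamma` and that the last scale value is .8. Exact fractions are used so that no comparison depends on floating-point rounding.

```python
from fractions import Fraction as Fr

A = ["w", "x", "y", "z", "alpha", "beta", "gamma"]
R = {("w", "x"), ("w", "y"), ("w", "z"), ("w", "alpha"), ("w", "beta"),
     ("w", "gamma"), ("x", "alpha"), ("x", "beta"), ("x", "gamma"),
     ("y", "beta"), ("y", "gamma"), ("z", "gamma"), ("alpha", "gamma")}

f = {"w": Fr(5), "x": Fr(3), "y": Fr("2.7"), "z": Fr("2.5"),
     "alpha": Fr("1.9"), "beta": Fr("1.6"), "gamma": Fr("0.8")}
delta = 1

# The relation induced by f through (6.4): a is related to b
# exactly when f(a) exceeds f(b) by more than the threshold.
induced = {(a, b) for a in A for b in A if f[a] > f[b] + delta}
print(induced == R)  # True
```

Since the induced relation coincides with R, the function f satisfies Eq. (6.4) with threshold 1.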
6.1.3 Uniqueness 


The representation (6.4), that is, the representation

$$\mathfrak{A} = (A, R) \to \mathfrak{B} = (Re, >_\delta),$$

is our first example of an irregular representation. To see this, let A =
{x, y, z} and let R = {(x, z), (x, y)}. Then two functions satisfying Eq.
(6.4) with δ = 1 are given by

$$f(x) = 2, \quad f(y) = 0, \quad f(z) = 0 \tag{6.7}$$

and

$$g(x) = 2, \quad g(y) = .1, \quad g(z) = 0. \tag{6.8}$$

Here f(y) = f(z) but g(y) ≠ g(z), so g is not a function of f.
By Theorem 2.1, $(\mathfrak{A}, \mathfrak{B}, f)$ is not a regular scale. (We have already encountered
this example in Section 2.2.) A uniqueness theorem that specifies a
class of admissible transformations would not be helpful here, since this
class could differ from homomorphism to homomorphism. Thus the theory
of scale type discussed in Section 2.3 does not apply to all semiorders.*
The theory of meaningfulness does apply, however, if we use the more 
general definition that a statement is meaningful if its truth or falsity is 
unchanged when scales in the statement are replaced by other acceptable 
scales. In this sense, in the case of a homomorphism f from (A, R) into
(Re, $>_\delta$), the statement

$$f(a) > f(b)$$

is not meaningful. For if a is y and b is z, this is not true for the
homomorphism f of Eq. (6.7), but is true for the homomorphism g of Eq.
(6.8). It is meaningful, however, to assert that

$$f(a) > f(b) + \delta,$$

that is, to assert that f(a) is “sufficiently larger” than f(b). We return to the
uniqueness question for semiorders in Section 6.1.6.

*However, if $\mathfrak{A}$ is a semiorder, then according to the discussion of Section 2.2.2, there is an
isomorphism F from the reduction $\mathfrak{A}^*$ of $\mathfrak{A}$ into $\mathfrak{B}$ = (Re, $>_\delta$). (Hence, $\mathfrak{A}^*$ is a semiorder.)
The Corollary to Theorem 2.1 implies that ($\mathfrak{A}^*$, $\mathfrak{B}$, F) is regular. The class of admissible
transformations of such a system ($\mathfrak{A}^*$, $\mathfrak{B}$, F) does not (usually) take any of the standard forms
(such as similarity transformations), and its characterization in general is an open problem.
However, Manders [1977] has recently obtained some interesting results on this problem.
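The failure of meaningfulness can be seen directly on the three-element example. The following is a minimal sketch, reading the scale values of (6.7) and (6.8) as f = (2, 0, 0) and g = (2, .1, 0) on (x, y, z), with threshold 1. The helper name `represents` is illustrative.

```python
A = ["x", "y", "z"]
R = {("x", "z"), ("x", "y")}
f = {"x": 2, "y": 0, "z": 0}      # values of (6.7), as read here
g = {"x": 2, "y": 0.1, "z": 0}    # values of (6.8), as read here

def represents(h, delta=1):
    """True if h satisfies (6.4): aRb exactly when h(a) > h(b) + delta."""
    return all(((a, b) in R) == (h[a] > h[b] + delta) for a in A for b in A)

print(represents(f), represents(g))   # True True
print(f["y"] > f["z"], g["y"] > g["z"])   # False True
print(f["x"] > f["y"] + 1, g["x"] > g["y"] + 1)   # True True
```

Both functions are acceptable scales, yet they disagree on whether the value at y exceeds the value at z, whereas they agree on every "sufficiently larger" comparison, as (6.4) requires.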


6.1.4 Interval Orders and Measurement without Numbers 


For another insight into the representation (6.4), let us consider the
interval


$J(a) = [f(a) - \delta/2, \; f(a) + \delta/2]$.

If $J$ and $J'$ are two real intervals, we shall say that

$J > J'$ iff $a > b$ for all $a \in J$ and $b \in J'$.

If $J > J'$, we say that $J$ strictly follows $J'$. If f satisfies (6.4), then

$aRb \Leftrightarrow J(a) > J(b)$.    (6.9)


We may think of J(a) as a range of fuzziness about a, or a range of 
possible values. For example, if we are estimating the monetary value of a 
particular product (as is done in one popular television program), J(a) 
could be a range of estimates. The model of behavior embodied in Eq. 
(6.9) says that we prefer a to b if and only if we are sure that every possible 
value of a is larger than every possible value of b. If (A, R) is a semiorder, 
then all the intervals J(a) have the same length. But it is interesting to
think of the possibility of letting them have different lengths. Certainly in 
the case of estimating the monetary values of different products, we want 
to allow the ranges of values to be of different lengths for different 
products. We now ask: When does there exist an assignment to each a in A 
of an interval J(a) so that for all a, b in A, (6.9) is satisfied? That is, under 
what circumstances is a person acting (at least) as if he satisfies the model 
(6.9)? If we take a more general point of view than we did in Section 2.1, 
then the assignment of intervals satisfying Eq. (6.9) is as legitimate a form 
of measurement as the assignment of numbers satisfying the representation 


$aRb \Leftrightarrow f(a) > f(b)$.


For, one of the goals of measurement is to reflect empirical relations by
well-known relations on mathematical objects. Having translated an
empirical relational system into what we shall loosely call a mathematical
relational system, we can apply the whole collection of mathematical tools
at our disposal to better understand the mathematical system and hence
the empirical one. In particular, we can apply our mathematical tools to
help in decisionmaking. In this broad sense, assignment of vectors, sets,
intervals, geometric objects, etc., is a perfectly legitimate form of
measurement if a representation theorem stating a homomorphism from an
empirical relational system to a mathematical relational system can be
proved. A similar point of view is expressed in Krantz [1968] and in Coombs,
Raiffa, and Thrall [1954].


Figure 6.2. An interval representation for the interval order (A, R), where A = {x, y, z, w}
and R = {(x, y), (y, z), (x, z)}. Intervals are displaced vertically for ease of comparison.

Having expressed this point of view, let us state a representation theo- 
rem for the representation (6.9). A binary relation (A, R) is called an 
interval order if it satisfies the first two axioms in the definition of a 
semiorder. Clearly, every semiorder is an interval order. But it is not too 
hard to give an example of an interval order that is not a semiorder. Let 
A = {x,y, z,w} and define R on A by 


$R = \{(x, y), (y, z), (x, z)\}$.


An interval representation satisfying Eq. (6.9) for (A, R) is shown in Fig. 
6.2. But (A, R) is not a semiorder, since xRy and yRz, but ~ xRw and 
~ wRz. 
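
The example can be verified by direct enumeration. The Python sketch below is illustrative only. It assumes the usual form of the three semiorder axioms (irreflexivity; aRb and cRd imply aRd or cRb; aRb and bRc imply aRd or dRc for every d), since the axioms themselves are stated in an earlier section, and the interval endpoints are hypothetical values chosen for the check, not read off Fig. 6.2.

```python
from itertools import product


def is_irreflexive(A, R):
    # First axiom (assumed form): aRa never holds.
    return all((a, a) not in R for a in A)


def satisfies_second_axiom(R):
    # Second axiom (assumed form): aRb and cRd imply aRd or cRb.
    return all((a, d) in R or (c, b) in R
               for (a, b), (c, d) in product(R, repeat=2))


def satisfies_third_axiom(A, R):
    # Third axiom (assumed form): aRb and bRc imply aRd or dRc, for every d in A.
    return all((a, d) in R or (d, c) in R
               for (a, b), (b2, c) in product(R, repeat=2) if b == b2
               for d in A)


def is_interval_order(A, R):
    return is_irreflexive(A, R) and satisfies_second_axiom(R)


def is_semiorder(A, R):
    return is_interval_order(A, R) and satisfies_third_axiom(A, R)


def strictly_follows(J, K):
    """J > K for closed intervals: every point of J exceeds every point of K."""
    return J[0] > K[1]


A = ["x", "y", "z", "w"]
R = {("x", "y"), ("y", "z"), ("x", "z")}

# Hypothetical closed intervals (left end, right end); w overlaps everything.
J = {"z": (0, 1), "y": (2, 3), "x": (4, 5), "w": (0, 5)}

represents = all(((a, b) in R) == strictly_follows(J[a], J[b])
                 for a, b in product(A, repeat=2))

print(is_interval_order(A, R), is_semiorder(A, R), represents)
```

The third axiom fails for a = x, b = y, c = z, d = w, which is the violation noted in the text; the long interval assigned to w is what a semiorder, with intervals of equal length, cannot supply.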

We now have the following representation theorem. 


THEOREM 6.2 (Fishburn [1970 b,c]). Suppose (A, R) is a binary relation 
on a finite set A. Then (A, R) is an interval order if and only if there is an 
assignment of an interval J(a) to each a in A so that for all a, b in A, 


$aRb \Leftrightarrow J(a) > J(b)$.    (6.9)


To the best of the author’s knowledge, there has not been much work 
done on the uniqueness of this representation. However, Greenough and 
Bogart [1979] define the length of an interval order and show that an 
interval order (without duplicated holdings*) of length n has a unique 
representation as a collection of intervals using no more than n points as 
end points. W.T. Trotter (personal communication) has also obtained some 
recent results. Uniqueness questions whose answer would be of particular
interest for applications revolve around the question of how overlapping
intervals overlap: in particular, when must one be contained inside
another?


*See Greenough and Bogart for a definition.

Let us comment briefly on why these kinds of questions are of interest. 
In many problems in the social sciences, we wish to put some objects into 
serial order. For example, in political science we wish to list candidates 
ranging from liberal to conservative. In psychology, we wish to place 
individuals in order of stages of development. In archaeology, we wish to 
place artifacts in chronological order. The general problem of seriation has 
close connections with interval orders (and with the interval graphs studied 
in Exers. 25, 26, and 30). Let us discuss the seriation problem in the 
context of sequence dating in archaeology. For references on this subject,
see Kendall [1963, 1969a,b, 1971a,b,c]. Each (type of) artifact in a
collection of interest was in use over a certain period (interval) of time. Suppose
we know for two artifacts, a and b, whether or not the time period of a
strictly followed that of b. By Theorem 6.2, an assignment of a time 
interval J(a) to each artifact a which preserves the observed relation of 
strict following exists if and only if this observed relation defines an 
interval order. If such an assignment exists, we can still get relationships 
among the time intervals wrong. For example, we might in reality have the 
time interval for x beginning after that for y and ending before that for y. 
However, an assignment J satisfying Eq. (6.9) might not have this prop- 
erty. It is only required that J(x) and J(y) overlap. It would be helpful to 
know under what circumstances two different interval assignments J and 
J’ satisfying Eq. (6.9) can have the property that J(x) is contained in J(y) 
while J’(x) is not contained in J’(y), and under what circumstances this 
cannot happen. For a further discussion of seriation problems and their 
connection with interval orders and interval graphs, see Coombs and Smith 
[1973], Hubert [1974], Roberts [1976, Section 3.4] or Roberts [1978b, 
Sections 3.3 and 4.2]. 


6.1.5 Compatibility between a Weak Order and a Semiorder 


Returning to the Scott-Suppes Theorem, we shall show that the theorem
is false without the assumption that A is finite. Then, we shall see how to
generalize the theorem. Our results will be useful in proving the
Scott-Suppes Theorem, and they will be applied in our discussion of probabilistic
consistency in Section 6.2.

To show that the Scott-Suppes Theorem is false without the assumption
that A is finite, let N be the set of positive integers and let $\alpha$ be any
element not in N. Take A to be $N \cup \{\alpha\}$ and define R on A by


$aRb \Leftrightarrow a > b + 1$  for $a, b \in N$,
$\alpha R a$  for all $a \in N$.    (6.10)

It is not hard to verify that (A, R) is a semiorder. But there is no function
$f: A \to Re$ satisfying Eq. (6.4). For suppose such a function f exists. Note
that since 3R1, $f(3) > f(1) + \delta$. By induction, $f(2n + 1) > f(1) + n\delta$. Now
$\alpha R(2n + 1)$ for all n, so $f(\alpha) > f(2n + 1) + \delta > f(1) + (n + 1)\delta$. Thus, $f(\alpha)$ is larger
than every positive number, which is impossible.

To see how to generalize the Scott-Suppes Theorem, let us suppose for 
the moment that a function f satisfying Eq. (6.4) exists. Then we define a 
binary relation W on A by 


$aWb \Leftrightarrow f(a) \ge f(b)$.    (6.11)


By Corollary 2 to Theorem 3.4, (A, W) is a weak order. It corresponds to 
the weak order on the reals “weakly to the right of” and gives us the 
relative order of the f values, if not the specific values. Now before we have 
calculated the function f, we do not know what W is. But we shall be able 
to uncover an appropriate W by defining it explicitly in terms of R. 
Namely, we take 


$aWb \Leftrightarrow (\forall c)[(bRc \Rightarrow aRc) \;\&\; (cRa \Rightarrow cRb)]$.    (6.12)


If W is the relation of Eq. (6.11), then certainly it satisfies the implication
$\Rightarrow$ of Eq. (6.12). For


$aWb \;\&\; bRc \Rightarrow [f(a) \ge f(b)] \;\&\; [f(b) > f(c) + \delta]$
$\Rightarrow f(a) > f(c) + \delta$
$\Rightarrow aRc$.

Similarly,

$aWb \;\&\; cRa \Rightarrow cRb$.


However, W does not necessarily satisfy the implication $\Leftarrow$ of Eq. (6.12).

To illustrate Eq. (6.12), let us consider the example $A =
\{w, x, y, z, \alpha, \beta, \gamma\}$ with R defined by Eq. (6.6). Then xWz holds. For
$zRc \Rightarrow xRc$. (There is only one case to check: $zR\gamma$ holds and also $xR\gamma$.)
Similarly, nothing is in the relation R to x, so of course $cRx \Rightarrow cRz$.
Similarly, xWx, xWy, $zW\alpha$, etc. W is the weak order which ranks w largest,
then x, then y, then z, then $\alpha$, then $\beta$, and then $\gamma$. W here is a simple order.
In general, W is only a weak order, as we shall prove shortly. To give a
second example, if $A = N \cup \{\alpha\}$ and R is defined by Eq. (6.10), then the
weak order W is given by the order $\ge$ on N with $\alpha W a$ for all a in N. W is
again simple.


LEMMA 6.3. If (A, R) is a semiorder and W is defined by Eq. (6.12), then
(A, W) is a weak order.

Proof. (A, W) is a weak order if and only if it is transitive and strongly
complete. To verify transitivity, suppose xWy and yWz. To show xWz,
choose c in A and show


$zRc \Rightarrow xRc$


and


$cRx \Rightarrow cRz$.


If zRc, then since yWz, yRc follows. Since xWy, yRc implies xRc. Thus,
$zRc \Rightarrow xRc$. A similar proof shows that $cRx \Rightarrow cRz$. So far, the semiorder
axioms have not been used. Proof of strong completeness proceeds by
cases and uses the semiorder axioms. Details are left to the reader. ∎


The binary relation W defined by Eq. (6.12) is called the weak order 
associated with R. Where several semiorders R exist, it will be convenient to 
denote the associated weak orders by W(R). The definition of W(R) is 
originally due to Luce [1956], and in the present form is due to Scott and 
Suppes [1958]. 

If (A, R) is a semiorder, then as before define a binary relation I on A
by


$aIb \Leftrightarrow \sim aRb \;\&\; \sim bRa$.    (6.1)


If R is preference, then I is indifference. As we have observed, if R is a
semiorder, I is not necessarily an equivalence relation as it was when R
was a strict weak order; for I may not be transitive.


LEMMA 6.4. If (A, R) is a semiorder and (A, W) is its associated weak
order, then for all $a, b, c \in A$,


$aRb \Rightarrow aWb$    (6.13)

and

$aWbWc \;\&\; aIc \Rightarrow aIb \;\&\; bIc$.    (6.14)

Condition (6.14) is known as the weak mapping rule, and was introduced by
Goodman [1951] in the following equivalent form:

$aWbWcWd \;\&\; aId \Rightarrow bIc$.    (6.15)


Equation (6.15) says that intervals of preference cannot be contained
within intervals of indifference. That is, we cannot prefer b to c or c to b
and at the same time be indifferent between a and d, if a is weakly to the
right of b, which is weakly to the right of c, which is weakly to the right of
d. Similarly, alternatives that are not within threshold cannot be contained
within alternatives that are within threshold. (Proof of the equivalence of
(6.14) and (6.15) is left to the reader.) The weak mapping rule has recently
found applications to seriation problems in archaeology, psychology, and
political science. See Hubert [1974] and Roberts [1979].


Proof of Lemma 6.4. Note that (A, R) is transitive; we already observed
that this follows from the third semiorder axiom. Suppose that aRb. Then
$bRc \Rightarrow aRc$ and $cRa \Rightarrow cRb$, since R is transitive. This proves (6.13). To
prove (6.14), suppose aWbWc and aIc. To show aIb, suppose aIb is false.
Then by Eq. (6.1), either aRb or bRa. If aRb, then bWc implies aRc, which
contradicts aIc. If bRa, then since aWb, bRa implies aRa, using c = a.
This contradicts the first semiorder axiom. A similar proof establishes bIc.


We say that a binary relation (A, R) and a weak order (A, W) are
compatible if for all $a, b, c \in A$, Eqs. (6.13) and (6.14) hold. Thus, we have
seen that every semiorder has a compatible weak order. The converse is
also true, if (A, R) is asymmetric, and will be useful in various
applications.


THEOREM 6.5 (Roberts [1971b]). Let (A, R) be an asymmetric binary 
relation. Then there is some weak order on A compatible with (A, R) if and 
only if (A, R) is a semiorder. 


Proof. The semiorder axioms can be verified directly from asymmetry
and Eqs. (6.13) and (6.14). The details are left to the reader.


If (A, R) is a semiorder, if $f: A \to Re$ satisfies Eq. (6.4), and if W is
defined on A by Eq. (6.11), then (A, W) is a weak order on A compatible 
with (A, R). Proof is straightforward. It is not necessarily the case that W 
satisfies Eq. (6.12). However, we shall observe in Theorem 6.6 that the 
weak order defined by (6.12) and W are “essentially” the same. 


6.1.6 Uniqueness Revisited 


There is no uniqueness theorem for the representation (6.4) which
defines the class of admissible transformations of the function f. However,
there is a uniqueness theorem in a different sense. Recall that in the case of
strict weak orders (A, R), the relation I of indifference defined by Eq. (6.1)
was an equivalence relation (Theorem 1.3). For semiorders (A, R) it is not:
nontransitivity of I was the motivation for the concept of semiorder. We
introduce an equivalence relation E on A by

$aEb \Leftrightarrow (\forall c \in A)(aIc \Leftrightarrow bIc)$.    (6.16)


Two alternatives a and b are in the relation E if and only if they are
(considered) indifferent to exactly the same alternatives c. The relation E
turns out to be exactly the perfect substitutes relation defined from
$\mathfrak{A} = (A, R)$ by the method of Section 1.8, as is easy to prove. In our
example of Eq. (6.6), we do not have aEb for any $a \ne b$. For example,
~ xEy, since ~ $xI\alpha$, but $yI\alpha$. To give an example that has some
equivalent elements, let A = {x, y, z} and R = {(x, z), (y, z)}. Then xEy, since
xIx, xIy, yIx, and yIy, while neither xIz nor yIz holds. In fact, in this
example, if $u \ne v$, then uEv iff {u, v} = {x, y}.

Suppose W is a weak order on A compatible with the semiorder (A, R). 
Proceeding by the method of Section 1.8, we now define A* = A/E to be 
the collection of equivalence classes under E and define W* = W/E on 
A* by 


$a^*W^*b^* \Leftrightarrow aWb$.    (6.17)


As usual with this procedure, W* is well-defined. Moreover, it is a simple
order on A*, since (A, W) is a weak order. If (A, W) is already simple,
then every equivalence class has only one element, and (A*, W*) is just the
same as (A, W). We define R* on A* by


$a^*R^*b^* \Leftrightarrow aRb$.    (6.18)


The reader can easily verify that (A*, R*) is well-defined, that it is a
semiorder, and that


$a^*W^*b^* \Leftrightarrow a^*W(R^*)b^*$,    (6.19)


where W(R*) is the weak order associated with R* by means of Eq. (6.12). 

Suppose next that W and W’ are two weak orders on A compatible with 
(A, R). If W* and W’* are defined from W and W’, respectively, by Eq. 
(6.17), then Eq. (6.19) implies that W* = W’*. It is now not hard to show 
that the only freedom allowed in the weak orders (A, W) or (A, W’) is to 
vary the ordering of points within equivalence classes under the relation E; 
this ordering within equivalence classes can be varied arbitrarily. To 
illustrate, if A = {x, y, z} and R = {(x, z), (y, z)}, then xEy, and three
weak orders on A compatible with R are


$W = \{(x, x), (y, y), (z, z), (x, y), (x, z), (y, z)\}$,
$W' = \{(x, x), (y, y), (z, z), (y, x), (x, z), (y, z)\}$,


and


$W'' = \{(x, x), (y, y), (z, z), (x, y), (y, x), (x, z), (y, z)\}$.


To summarize: 


THEOREM 6.6 (Roberts [1971b]). Let (A, R) be a semiorder and suppose
(A, W) and (A, W') are two weak orders on A.

(a) If both (A, W) and (A, W') are compatible with R, then if ~ aEb, we
have $aWb \Leftrightarrow aW'b$.

(b) If (A, W) is compatible with R and if, whenever ~ aEb, we have
$aWb \Leftrightarrow aW'b$, then (A, W') is compatible with R.


COROLLARY 1. If (A, R) is a semiorder, then W(R*) defined from
(A*, R*) by Eq. (6.12) is the unique simple order compatible with (A*, R*).


COROLLARY 2. If (A, R) is a semiorder and (A, W) is a compatible weak
order on A, then (A, W) is obtained from (A*, W(R*)) by ordering elements
within equivalence classes.


COROLLARY 3. If (A, R) is a semiorder, then it has a compatible simple
order W.


Proof. Obtain W from W(R*) by using a simple order within each
equivalence class.
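
For a small semiorder, the content of Theorem 6.6 can be checked by brute force. The Python sketch below is an illustration under stated assumptions and not part of the original text: it uses the example A = {x, y, z}, R = {(x, z), (y, z)} from this section, computes I, E, and W(R) directly from Eqs. (6.1), (6.16), and (6.12), generates every weak order on A from rank assignments, and keeps those satisfying (6.13) and (6.14). All function names are hypothetical.

```python
from itertools import product

A = ["x", "y", "z"]
R = {("x", "z"), ("y", "z")}


def indifference(A, R):
    # Eq. (6.1): aIb iff neither aRb nor bRa.
    return {(a, b) for a, b in product(A, repeat=2)
            if (a, b) not in R and (b, a) not in R}


def associated_weak_order(A, R):
    # Eq. (6.12): aWb iff for all c, (bRc => aRc) and (cRa => cRb).
    return {(a, b) for a, b in product(A, repeat=2)
            if all(((b, c) not in R or (a, c) in R) and
                   ((c, a) not in R or (c, b) in R) for c in A)}


def perfect_substitutes(A, R):
    # Eq. (6.16): aEb iff a and b are indifferent to exactly the same elements.
    I = indifference(A, R)
    return {(a, b) for a, b in product(A, repeat=2)
            if all(((a, c) in I) == ((b, c) in I) for c in A)}


def all_weak_orders(A):
    """Every weak order on A, generated from rank assignments with ties allowed."""
    orders = set()
    for ranks in product(range(len(A)), repeat=len(A)):
        rank = dict(zip(A, ranks))
        orders.add(frozenset((a, b) for a, b in product(A, repeat=2)
                             if rank[a] >= rank[b]))
    return orders


def is_compatible(A, R, W):
    I = indifference(A, R)
    cond_6_13 = all(pair in W for pair in R)
    cond_6_14 = all((a, b) in I and (b, c) in I
                    for a, b, c in product(A, repeat=3)
                    if (a, b) in W and (b, c) in W and (a, c) in I)
    return cond_6_13 and cond_6_14


E = perfect_substitutes(A, R)
W_R = frozenset(associated_weak_order(A, R))
compatible = [W for W in all_weak_orders(A) if is_compatible(A, R, W)]

# Theorem 6.6(a): compatible weak orders agree on every pair not related by E.
agree_off_E = all(((a, b) in W1) == ((a, b) in W2)
                  for W1 in compatible for W2 in compatible
                  for a, b in product(A, repeat=2) if (a, b) not in E)

print(len(all_weak_orders(A)), len(compatible), agree_off_E)
```

On this example the enumeration finds exactly three compatible weak orders, differing only in how x and y are ordered within their common equivalence class, and W(R) is the one that ties them.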


6.1.7 Proof of the Scott-Suppes Theorem 


We are now ready to present a proof of the Scott-Suppes Theorem. We
have already shown in Section 6.1.2 that the semiorder axioms follow from
the representation (6.4). Let us now prove the converse. This proof is
constructive.* By the Corollary to Theorem 6.1, we may set $\delta = 1$. Let
(A, R) be a semiorder, with A finite. By Corollary 3 to Theorem 6.6, there
is a simple order W on A compatible with R. We shall prove by induction
on n = |A| that there is a function $f: A \to Re$ satisfying the following three
conditions for all $a, b \in A$:


$aRb \Leftrightarrow f(a) > f(b) + 1$,    (6.20)
$aWb \Leftrightarrow f(a) \ge f(b)$,    (6.21)
$|f(a) - f(b)| \ne 1$.    (6.22)


If |A| = 1, the result is clear. Thus, let us assume the result for $|A| = n - 1$
and show it for $|A| = n$.

By virtue of Corollary 2 of Theorem 3.1, we may list the elements of A
as $a_1, a_2, \ldots, a_n$ in such a way that $a_i W a_j$ iff $i \ge j$.


*The author thanks a referee for suggesting the idea of this proof, which is different from
the original Scott-Suppes proof. For an alternative proof, see Rabinovitch [1977].


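LEMMA 6.7. $a_i R a_j \Leftrightarrow i > j$ and ~ $a_i I a_j$.    (6.23)


Proof. The implication $\Rightarrow$ follows from the definition of I and the
compatibility between W and R, which implies that $a_i W a_j$. To prove $\Leftarrow$,
note that since W is simple, $i > j \Rightarrow$ ~ $a_j W a_i$. Thus, ~ $a_j R a_i$, and since
~ $a_i I a_j$, it follows that $a_i R a_j$. ∎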


Let $A' = A - \{a_n\}$ and let R' and W' be the respective restrictions of R
and W to A'. It is clear that (A', R') is a semiorder and (A', W') is a
compatible simple order. Thus, by the induction hypothesis, there is a
function $f: A' \to Re$ satisfying conditions (6.20) through (6.22) for all a, b in
A'. We shall extend f to A by defining $f(a_n)$. This is defined by considering
three cases.


Case 1. $a_n R a_{n-1}$.

In this case, compatibility implies that $a_n R a_i$ for all i < n. Define $f(a_n)$
to be $f(a_{n-1}) + 2$. It is clear that the extended function f satisfies Eqs.
(6.20) through (6.22).


Case 2. ~ $a_n R a_1$.

In this case, by Lemma 6.7, $a_n I a_i$ for all i < n. Let


$\lambda = f(a_{n-1}) - f(a_1)$.

By condition (6.22), $\lambda \ne 1$, and by compatibility, $a_{n-1} I a_1$, so $\lambda < 1$. Let


$f(a_n) = f(a_1) + \frac{\lambda + 1}{2}$.


Then clearly

$|f(a_n) - f(a_1)| < 1$

and

$f(a_n) > f(a_{n-1})$.

It is easy to see that (6.20) through (6.22) now hold on A.


Case 3. $a_n R a_1$ and ~ $a_n R a_{n-1}$.

It is clear from compatibility and Lemma 6.7 that there is an i such that

$a_n R a_j$  for $j \le i$,

and

~ $a_n R a_j$  for $j \ge i + 1$.


Thus, to obtain a function f satisfying Eqs. (6.20) through (6.22), we want
to define $f(a_n)$ so that


$f(a_n) > f(a_i) + 1$,    (6.24)
$f(a_n) < f(a_{i+1}) + 1$,    (6.25)
$f(a_n) > f(a_{n-1})$.    (6.26)


Equation (6.24) comes from requirement (6.20), Eq. (6.25) from
requirements (6.20) and (6.22), and Eq. (6.26) from requirement (6.21). Now since
~ $a_n R a_{i+1}$, and since $i + 1 \le n - 1 < n$, we have $a_{i+1} I a_{n-1}$ and so by
(6.20) and (6.22) on A',


$f(a_{n-1}) < f(a_{i+1}) + 1$.

Also, we have

$f(a_i) + 1 < f(a_{i+1}) + 1$.


We simply choose $f(a_n)$ to be any number bigger than $f(a_i) + 1$ and bigger
than $f(a_{n-1})$, but smaller than $f(a_{i+1}) + 1$. To make a specific choice, let


$\lambda' = \max\{f(a_{n-1}), f(a_i) + 1\}$,

and let $f(a_n)$ be halfway between $f(a_{i+1}) + 1$ and $\lambda'$; that is, let


$f(a_n) = \frac{f(a_{i+1}) + 1 + \lambda'}{2}$.


This guarantees that conditions (6.24), (6.25), and (6.26) hold and hence
that f satisfies (6.20) through (6.22). This completes the proof of the
Scott-Suppes Theorem.


The inductive proof given above clearly gives rise to a stepwise
construction of a function f. We first define $f(a_1)$, then $f(a_2)$, etc. To illustrate, let


$A = \{w, x, y, z, \alpha, \beta, \gamma\}$,

and let R be defined on A by Eq. (6.6). Then, as we have noted, W can be
taken to be the simple order $a_7 = w$, $a_6 = x$, $a_5 = y$, $a_4 = z$, $a_3 = \alpha$,
$a_2 = \beta$, $a_1 = \gamma$.


Step 1. Arbitrarily define $f(a_1) = f(\gamma)$ to be 0.

Step 2. To define $f(a_2) = f(\beta)$, note that Case 2 applies, and we have


$\lambda = f(a_{n-1}) - f(a_1) = f(a_1) - f(a_1) = 0$.

Thus,

$f(\beta) = f(a_2) = f(a_1) + \frac{\lambda + 1}{2} = f(\gamma) + \frac{1}{2} = \frac{1}{2}$.


Step 3. To define $f(a_3) = f(\alpha)$, note that $\alpha R \gamma$ and ~ $\alpha R \beta$, so Case 3
applies with i = 1. Then


$\lambda' = \max\{f(a_2), f(a_1) + 1\} = \max\{\frac{1}{2}, 1\} = 1$.


Hence,


$f(\alpha) = f(a_3) = \frac{f(a_2) + 1 + \lambda'}{2} = \frac{f(\beta) + 1 + 1}{2} = \frac{5}{4}$.


Step 4. To define $f(a_4) = f(z)$, note that $zR\gamma$ but ~ $zR\beta$, so again Case
3 applies, with i = 1. Then


$\lambda' = \max\{f(a_3), f(a_1) + 1\} = \frac{5}{4}$

and


$f(z) = f(a_4) = \frac{f(a_2) + 1 + \lambda'}{2} = \frac{11}{8}$.


Step 5. To define $f(a_5) = f(y)$, note that Case 3 again applies, with
i = 2. Thus,


$\lambda' = \frac{3}{2}$


and


$f(y) = f(a_5) = \frac{f(a_3) + 1 + \lambda'}{2} = \frac{15}{8}$.

Step 6. To define $f(a_6) = f(x)$, note that Case 3 again applies, with
i = 3. Then


$\lambda' = \frac{9}{4}$


and


$f(x) = f(a_6) = \frac{f(a_4) + 1 + \lambda'}{2} = \frac{37}{16}$.


Step 7. To define $f(a_7) = f(w)$, note that Case 1 applies, and so


$f(w) = f(a_7) = f(a_6) + 2 = \frac{69}{16}$.


To sum up, the function f is defined by Table 6.1.


Table 6.1

u       γ     β      α       z       y       x       w
f(u)    0     8/16   20/16   22/16   30/16   37/16   69/16
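
The stepwise construction lends itself to a short program. The Python sketch below is one possible rendering of Cases 1 through 3 and is offered only as an illustration; the function name scott_suppes_scale is hypothetical. Since Eq. (6.6) is not reproduced in this section, the relation R used here is an assumption: it is reconstructed from the case analysis of Steps 1 through 7 (which elements each new element is in the relation R to) and should be checked against Eq. (6.6). Exact rational arithmetic is used so that the values can be compared with Table 6.1.

```python
from fractions import Fraction
from itertools import product


def scott_suppes_scale(order, R):
    """Build f with aRb <=> f(a) > f(b) + 1, following Cases 1-3 of the proof.

    order lists the elements from a_1 (lowest in W) up to a_n (highest);
    R is a set of ordered pairs. (A, R) is assumed to be a semiorder and
    order a compatible simple order.
    """
    f = {order[0]: Fraction(0)}                 # Step 1: f(a_1) = 0, arbitrarily
    for n in range(1, len(order)):              # position of the new element
        new, first, prev = order[n], order[0], order[n - 1]
        if (new, prev) in R:
            # Case 1: the new element is R-above everything already placed.
            f[new] = f[prev] + 2
        elif (new, first) not in R:
            # Case 2: the new element is indifferent to everything placed.
            lam = f[prev] - f[first]
            f[new] = f[first] + (lam + 1) / 2
        else:
            # Case 3: i is the last position with new R a_i.
            i = max(j for j in range(n) if (new, order[j]) in R)
            lam_prime = max(f[prev], f[order[i]] + 1)
            f[new] = (f[order[i + 1]] + 1 + lam_prime) / 2
    return f


order = ["gamma", "beta", "alpha", "z", "y", "x", "w"]   # a_1, ..., a_7

# Assumed relation, reconstructed from Steps 1-7 (not copied from Eq. (6.6)).
R = {("alpha", "gamma"), ("z", "gamma"),
     ("y", "gamma"), ("y", "beta"),
     ("x", "gamma"), ("x", "beta"), ("x", "alpha")}
R |= {("w", u) for u in order if u != "w"}

f = scott_suppes_scale(order, R)
represents = all(((a, b) in R) == (f[a] > f[b] + 1)
                 for a, b in product(order, repeat=2))

for u in order:
    print(u, f[u])
print(represents)
```

Fractions print in lowest terms, so 8/16 appears as 1/2, 20/16 as 5/4, and so on. Other choices within the open interval allowed in Case 3 would serve equally well; taking the midpoint is simply the specific choice made in the proof.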


6.1.8 Semiordered Versions of Other Measurement 
Representations 


Many of the representations studied earlier in this volume have ana- 
logues if we use the same idea that motivated the semiorder representation 
(6.4). For example, it is natural to study the representation 


$aRb \Leftrightarrow f(a) > f(b) + \delta$    (6.4)

and


$f(a \circ b) = f(a) + f(b)$,    (6.27)


a modification of extensive measurement (Section 3.2). A variant of the
representation (6.4), (6.27) has been studied by Adams [1965], Krantz
[1967], Luce [1973], and Krantz et al. [to appear]. These authors find it
awkward to deal with an operation $\circ$; they use a set X and comparisons
between finite subsets of X, with union $\cup$ playing the role of $\circ$, though
not quite analogously. In place of (6.27), they require that


$f(a \cup b) = f(a) + f(b)$,

provided $a \cap b = \emptyset$.


Luce [1973] and Krantz et al. [to appear] study a semiordered version of
conjoint measurement (Section 5.4), namely, the representation


$(a_1, a_2)R(b_1, b_2) \Leftrightarrow f_1(a_1) + f_2(a_2) > f_1(b_1) + f_2(b_2) + \delta$.


The representation of subjective probability measurement, which we 
study in Chapter 8, also has a semiordered analogue. This has been studied 
by Fishburn [1969], Domotor [1969], Stelzer [1967], Domotor and Stelzer 
[1971], Luce [1973], and Krantz et al. [to appear]. As of this writing, no one 
has succeeded in axiomatizing a semiordered version of the expected utility 
representations we shall study in Chapter 7. 

Any standard relation on the reals has an analogue if equality is 
replaced by “closeness” or tolerance, as in the semiorder example. Some 
examples of such relations, for example $\epsilon$-betweenness, are studied in
Roberts [1973]. The geometry imposed on the reals by such relations is 
called tolerance geometry. It has potential applications in visual perception 
(Roberts [1970]; see also Poston [1971] and Zeeman [1962]). A systematic 
development of tolerance geometry has not been carried out. 


Exercises 


1. Raiffa [1968, p. 79] gives the following example. A man is indif- 
ferent between a paid vacation in Mexico (M) and a paid vacation in 
Hawaii (H). If he is offered a $1 bonus to go to M, he might still be 
indifferent between H and M + $1. Similar reasoning suggests the follow- 
ing indifferences: 


H I (M + $1)
(M + $1) I (H + $2)
(H + $2) I (M + $3)
. . .
(M + $9999) I (H + $10,000).

Transitivity of indifference would imply

H I (H + $10,000).


[It would also imply that H I (H + $2).] Can this example be explained
using the threshold or semiorder model? (Incidentally, Raiffa argues that 
an individual confronted with this example might very well reconsider his 
preferences and indifferences.) 
2. (a) Suppose A = {a, b, c, d} and R = {(a, d)}. Show that (A, R) is
a semiorder.
(b) Which of the following binary relations (A, R) are semiorders?
(i) A = Re, aRb iff a < b - 1.
(ii) A = {a, b, c, d}, R = {(a, b), (b, c), (c, d)}.
(iii) A = N, R = >.
(iv) A = all subsets of {1, 2, 3}, R = $\subsetneq$.

3. (a) Suppose A = {a, b, c, d, e} and R = {(a, b), (b, c), (a, c)}.
Show that (A, R) is not a semiorder, but is an interval order.
(b) Which of the examples in Exer. 2b are interval orders?


4. Show that if an individual's preferences are defined as in Table 6.2,
then there is no function f satisfying Eq. (6.4). However, there is such a
function if Volkswagen is preferred to Toyota.


Table 6.2. Preferences Among Cars
(The i, j entry is 1 iff i is preferred to j.)

              Buick   Cadillac   Toyota   Volkswagen
Buick           0        0          1          0
Cadillac        1        0          1          0
Toyota          0        0          0          0
Volkswagen      0        0          0          0


5. (a) In an experiment of Estes (reported in Atkinson, Bower, and 
Crothers [1965, pp. 146-150] and Restle and Greeno [1970, p. 241]), each 
member of a group of subjects was asked a set of questions about which of 
two distinguished personalities he would rather meet and talk to. Table 6.3 
shows the combined group preference. Does the group preference define a 
semiorder? 


(b) In asking judges to compare the taste or flavor of vanilla 
puddings, Davidson and Bradley [1969] obtained the group data of Table 
6.4. Does the group preference for vanilla puddings define a semiorder? 

(c) Consider the judgments of relative importance among objectives 
for a library system in Dallas shown in Table 1.8 of Chapter 1. If R is 
“more important than” on the set A = {a, b, c, d, e, f}, is (A, R) a semi- 
order? 

(d) Repeat part (c) for the judgments of relative importance among 
goals for a state environmental agency in Ohio shown in Table 1.9 of 
Chapter 1. 


Table 6.3. Group Preference for Meeting with Distinguished Individuals*
(Entry i, j is 1 if more than 75% of the group members
preferred meeting and talking to i more than to j.)

                      Dwight       Winston     Dag           William
                      Eisenhower   Churchill   Hamerskjold   Faulkner
Dwight Eisenhower         0            0            1            1
Winston Churchill         0            0            1            1
Dag Hamerskjold           0            0            0            0
William Faulkner          0            0            0            0


*Data from an experiment of Estes (Atkinson, Bower, and Crothers [1965, pp.
146-150]). See also Table 6.9.

Table 6.4. Group Preference for Vanilla Puddings* (Entry i, j is 1 if
more than 60% of the group members preferred the taste of pudding i to
pudding j.)


Mk WN = 

oooroe 
oor oon 
or OOo = Ww 
-—Oooo-f 
oo oo ow 


*Data from Davidson and Bradley 
[1969]. See also Table 6.10. 


6. Suppose A is the set of sounds {x, y, u, v, w} and $p_{ab}$ as defined in
Exer. 6, Section 3.1 gives the proportion of times a subject says that a is
louder than b. Suppose aRb holds if and only if $p_{ab} \ge .75$. Is (A, R) a
semiorder?

7. Is every strict weak order a semiorder? 


8. Let (A, R) be the semiorder of Exer. 2a. Show that there is a
homomorphism $f: (A, R) \to (Re, >_\delta)$, with $\delta = 1$, which is irregular.

9. Is every strict partial order a semiorder? 

10. Show that every interval order is a strict partial order. 

11. Is every strict partial order an interval order? 


12. (a) Suppose (A, R) is the semiorder of Exer. 2a. 
(i) Show that the weak order W(R) associated with (A, R) is 
given by a ranked first, then b, c tied, and then d. 
(ii) Show that all possible compatible simple orders are, from 
first element to last: a, b, c, d, and a, c, b, d. 
(iii) Find all possible compatible weak orders. 
(b) For the following semiorders, find all possible compatible weak 
orders: 


(i) A = {a, b, ¢, d, e}, 
= {(a, b), (a, d), (a, e), (5, e), (c, 2), (c, e)}. 
(ii) A = {a, b,c, d, e}, 
7 R = {(a, c), (a, d), (a, e), (5, e), Ce, e)}- 
(iii) A = {a, b, c, d, e}, 


R = {(a, e), (a, d), (c, d), (b, 4), (e, d)}. 
(c) For each semiorder of part (b), pick a compatible simple order 
and find the function f defined by the constructive method of Section 
6.1.7. 


13. Let $A = \{1, 2\} \times N$ and let R on A be the lexicographic ordering,
that is,

$(a, s)R(b, t) \Leftrightarrow a > b$ or $(a = b \;\&\; s > t)$.

Show that (A, R) is a semiorder, but there is no function $f: A \to Re$
satisfying Eq. (6.4).

14. Show that if (A, R) is a semiorder and W is defined by Eq. (6.12),
then (A, W) is strongly complete.

15. Suppose (A, W) is a weak order and I is a binary relation on A.
Show that (6.14) and (6.15) are equivalent.

16. Suppose (A, R) is a semiorder and E is the equivalence relation of 
Eq. (6.16). Show that E is the same as the perfect substitutes relation 
defined by the method of Section 1.8. 

17. Suppose (A, R) is a semiorder, E is the equivalence relation of Eq. 
(6.16), and suppose aEb => a = b. Show that every compatible weak order 
is a simple order. 


18. Suppose (A, R) is a semiorder, A* = A/E, and we define

$a^*R^*b^* \Leftrightarrow aRb$.


Suppose there is a real-valued function F on A* such that for all
$a^*, b^* \in A^*$,


$a^*R^*b^* \Leftrightarrow F(a^*) > F(b^*) + 1$.


Show that if f(a) is defined to be F(a*), then f satisfies Eq. (6.5). 


19. In conjoint measurement, suppose we replace the representation of
Eq. (5.19) with the representation


$(a_1, a_2)R(b_1, b_2) \Leftrightarrow f_1(a_1) + f_2(a_2) > f_1(b_1) + f_2(b_2) + \delta$,


where $\delta$ is a positive constant. Show that it still follows that every strictly
bounded standard sequence (Section 5.4.3) is finite.*


*The reader should take care to note that E as defined in Chapter 5 corresponds to I as
defined in this chapter, and not to E as defined here.


20. (Roberts [1969a]) Suppose A is finite, (A, R) is a semiorder, and I is
defined on A by Eq. (6.1). Prove that there is an x in A with the following
property: whenever xIa and xIb, then aIb.


21. Since every semiorder is a strict partial order, one can talk about the
dimension of the semiorder in the sense of Section 1.6, Exers. 12 through
18. Figure 6.3 shows the Hasse diagram of a semiorder. Verify that its
dimension is 3. (Rabinovitch [1978] shows that every semiorder on a finite
set has dimension at most 3. However, Bogart, Rabinovitch, and Trotter
[1976] show that interval orders on finite sets can have arbitrarily large
dimension.)


Figure 6.3. The Hasse diagram of a semiorder with dimension 3.


22. Exercises 22 through 30 deal with the case where we start with
judgments of indifference only. Show that there is a function f satisfying
Eq. (6.3) if and only if (A, I) is an equivalence relation.


23. If we start with judgments of indifference, the representation
corresponding to Eq. (6.4) is

$aIb \Leftrightarrow |f(a) - f(b)| \le \delta$.    (6.28)

Show that a binary relation satisfying Eq. (6.28) is not necessarily an
equivalence relation. Binary relations satisfying (6.28) are called
indifference graphs and are characterized in Roberts [1969a].* See Roberts
[1978b, 1979] for recent applications of indifference graphs.

24. (a) Show that the following binary relation defines an indifference 
graph: 


A = {x,y, u, v}, 
I= {(x, x), (y, y); (u, u), (v, v), (x,y); (y, x); (y, u), (u, y)s (u, v), (v, u)}. 


(b) Which of the following binary relations are indifference graphs? 


(i) A ={x, y, u, v}, 
- i {(x, x), (¥, y), (u, u), (0, 0), (x, 4), (u, x)}. 
(ii) ; ={x,y, u}, 
=. {(x, x), Vy, y)s (u, u), (x, y), YY, x), (y, u), (u, y)s 


(u, x), (x, u)}. 
(iil) A ={x, y, u, v}, 
I= {(x, x), (¥; ¥), (u, &), (v, v), (x, y), (y, x), (y, u), (u, y), 
(u, v), (v, u), (v, x), (x, v)}. 
{x, Jy, U, v}, 
{(x, x), (y, ¥), (u, 4), (v, v), (x, ¥), CY, x), CX, u), 
(u, x), (x, v), (v, x)}. 


*The representation (6.28) generalizes to arbitrary metric spaces (X, d). We seek a function
$f: A \to X$ such that for all $a, b \in A$,

\[ aIb \iff d[f(a), f(b)] \le \delta. \]

Not much work has been done on this representation to date, and in particular there is no
known characterization of relations (A, I) for which such a representation exists with (X, d)
n-dimensional Euclidean space, n > 1. For some partial results, see Roberts [1969b].

25. If we start with just judgments of indifference, the representation
corresponding to Eq. (6.9) is

\[ aIb \iff J(a) \cap J(b) \neq \emptyset. \tag{6.29} \]


Binary relations satisfying Eq. (6.29) are called interval graphs.* The 
interval graphs have been characterized by Lekkerkerker and Boland 
[1962], Fulkerson and Gross [1965], and Gilmore and Hoffman [1964]. 
They have a wide variety of applications—to problems of genetics, 
archaeology, developmental psychology, phasing of traffic lights, ecosys- 
tems, etc. See Roberts [1976, Section 3.4] for a discussion of a variety of 
applications. 

(a) Show that every indifference graph is an interval graph. 

(b) Show that the converse is false. 


26. Which of the binary relations of Exers. 24a and 24b define interval 
graphs? 

27. Suppose (A, R) is a binary relation. The complement of (A, R) is the
binary relation $(A, R^c)$, where

\[ R^c = \{(a, b) \in A \times A : (a, b) \notin R\}. \]


The complement of an indifference graph is not reflexive and hence could 
not be an indifference graph. However, show that if all pairs (x, x) for x in 
A are added to the complement of an indifference graph, the resulting 
binary relation might be an indifference graph. 


28. Suppose (A, K) is a symmetric binary relation. An orientation of
(A, K) is an antisymmetric binary relation P on A such that $P \cup P^{-1} = K$.
(For definitions of union $\cup$ and converse $^{-1}$, see Exers. 3 and 1 of
Section 1.2.) The binary relation P picks one and only one of the pairs
(x, y) and (y, x) whenever these pairs are both in the relation K and
$x \neq y$. To give an example, if A = {x, y, z} and
K = {(x, y), (y, x), (y, z), (z, y)}, then one orientation P of (A, K) is given by
P = {(x, y), (y, z)}. If (A, I) is an indifference graph, then one way of
defining an orientation of the complement of (A, I) (Exer. 27) is to use the
function f satisfying Eq. (6.28) and define

\[ xPy \iff f(x) > f(y) + \delta. \tag{6.30} \]


*The representation (6.29) has a higher dimensional analogue just as does the representa-
tion (6.28). Instead of intervals on the line, we think of boxes in n-space, n-dimensional
rectangles with sides parallel to the coordinate axes. We ask for an assignment of a box B(a)
to each a in A such that for all a, b in A,

\[ aIb \iff B(a) \cap B(b) \neq \emptyset. \]

This representation seems particularly hard to characterize for higher dimensions, even two. It
seems to have a variety of applications, including some very interesting ecological ones. For
references, see Roberts [1969b, 1976, Section 3.5, 1978a], Cohen [1978], Gabai [1976], and
Trotter [1977].

(a) Find such an orientation P for the indifference graph of Exer.
24a.

(b) Verify that P defined by Eq. (6.30) always defines an orientation
of the complement of (A, I).


29. (a) Show that if (A, I) is an indifference graph, then its complement
has an orientation that is transitive. (A symmetric binary relation that has
a transitive orientation is called a comparability graph. Comparability
graphs have been characterized by Ghouila-Houri [1962] and Gilmore and
Hoffman [1964].)

(b) Show that if A is finite and (A, I) is symmetric, then (A, I) is an
indifference graph if and only if its complement has an orientation which
is a semiorder.


30. (a) Show that if (A, I) is an interval graph, then its complement is a
comparability graph (Exer. 29).

(b) Show that if (A, I) is symmetric, then (A, I) is an interval graph
if and only if its complement has an orientation which is an interval order.


31. A long-standing problem in the dimension theory of strict partial
orders has been the problem of characterizing strict partial orders of a
given dimension. (Dimension is defined in Exers. 12 through 18 of Section
1.6.) On the basis of a result of Dushnik and Miller [1941], Baker,
Fishburn, and Roberts [1971] show that a strict partial order has dimension
1 or 2 if and only if its symmetric complement (Exer. 12, Section 1.3) is a
comparability graph. Use this criterion to determine whether or not the
strict partial orders whose Hasse diagrams are given in Fig. 1.8 have
dimension less than or equal to 2.


32. (Ducamp and Falmagne [1969]). In Section 5.6.1, we talked about a
set S of individuals and a set E of reactions or experiences or statements or
test items. We talked about a binary relation R on $A = S \cup E$ with
$R \subseteq S \times E$ and aRb interpreted to mean that individual a agrees with
statement b, or answers test item b correctly, etc. Let us distinguish two
types of positive replies by individuals. For example, they either agree
strongly with a statement, or agree, not necessarily strongly. Or they
answer a test question correctly without difficulty, or they answer it
correctly, perhaps with difficulty. Let R and T be binary relations on A
representing these two levels of reactions, with $R \subseteq S \times E$ and $T \subseteq S \times E$.
Corresponding to the semiorder idea, let us ask for functions $s: S \to \mathrm{Re}$
and $e: E \to \mathrm{Re}$ and real numbers $\delta$ and $\eta$, with $\delta > \eta$, such that for all
$a \in S$ and $b \in E$,

\[ aRb \iff s(a) > e(b) + \delta \tag{6.31} \]
and
\[ aTb \iff s(a) > e(b) + \eta. \tag{6.32} \]


The functions s and e generalize the notion of a Guttman scale which we 
defined in Section 5.6.1. 

(a) Give proof or counterexample: We can find a representation
(6.31), (6.32) with some $\delta > \eta$ if and only if we can find one with $\delta = 1$,
$\eta = 0$.

(b) Give proof or counterexample: If we can find a representation
with some $\delta > \eta$, then we can find one with any $\delta > \eta$.

33. Ducamp and Falmagne [1969] define a bisemiorder to be a quadru-
ple (S, E, R, T) such that R and T are binary relations on $S \cup E$, both
$R \subseteq S \times E$ and $T \subseteq S \times E$, and such that the following axioms hold for
all $a, a' \in S$ and $b, b' \in E$:


AXIOM BS1. If aRb, then aTb.
AXIOM BS2. If aRb, not a'Rb, and a'Rb', then aRb'.
AXIOM BS3. If aTb, not a'Tb, and a'Tb', then aTb'.
AXIOM BS4. If aTb, not a'Tb, and a'Rb', then aRb'.
AXIOM BS5. If aRb, not a'Tb, and a'Tb', then aRb'.


Show that all these axioms are necessary conditions for the representation 
(6.31), (6.32). (Ducamp and Falmagne also prove their sufficiency. For a 
more recent proof, see Ducamp [1978].) 


6.2 The Theory of Probabilistic Consistency 
6.2.1 Pair Comparison Systems 


In Section 3.1, we discussed two possible interpretations for axioms like 
those that define a strict weak order or a semiorder. These either could be 
considered conditions of rationality, which say something about a rational 
person’s judgments, or they could be considered testable conditions, sub- 
ject to experimental verification. If we look at these conditions in the latter 
sense, then we see that individuals are often inconsistent in their judg- 
ments. For example, a subject might at one time say he prefers a to b and 
not b to a, and at another time, maybe even just a bit later in the same 
session, and under circumstances that seem unchanged, say he prefers b to 
a and not a to b. (It is not clear that it is possible to present judgments
under exactly identical conditions, since prior judgments can in principle
influence later judgments. However, we are trying to explain behavior, and
often individuals behave as if they are making inconsistent judgments
under seemingly identical conditions.) Similarly, a subject may at one
point say sound a is louder than sound b and b is not louder than a, and
soon thereafter say sound b is louder than sound a and a is not louder than
b. If these observations are correct, then all the measurement techniques
we have described so far break down. For “preferred to,” “louder than,”
and similar judgments often do not define relations, the starting point of
our theories.

If subjects were totally inconsistent in their judgments, then there would 
be no hope of developing theories of their behavior. But in many cases, 
there is a pattern to the inconsistencies, so a better word than “incon- 
sistencies” would be “variations.” Although a subject may not be “ab- 
solutely consistent,” he may be “probabilistically consistent,” to para- 
phrase the terminology of Block and Marschak [1959]. We shall try to make 
precise some notions of probabilistic consistency in this section, and relate 
them to measurement. We shall formulate the theory in terms of prefer- 
ence, though we shall keep in mind other applications, in particular to 
psychophysics. 

Suppose A is a set. Let $p(a, b) = p_{ab}$ be the frequency with which a is
preferred to b, that is, the proportion of times a subject says he prefers a to
b. Obviously, similar definitions apply if judgments are of relative loudness,
relative importance, and so on. The numbers $p_{ab}$ define a function
$p: A \times A \to [0, 1]$. We shall assume that each time the subject makes a
judgment, the conditions are “identical.” (As we have previously observed,
this assumption makes it questionable whether we can ever obtain the data
$p_{ab}$. For in an experiment, once a choice has been made between a and b,
this choice is likely to affect the choice between a and b later on, even
though efforts are made to mask the fact that the choice has previously
been made.)

The system (A, p) is called a pair comparison system. We assume that
each time an individual makes a judgment between a and b, he is forced to
say he prefers one to the other. Thus, for all $a \neq b$ in A, we have

\[ p_{ab} + p_{ba} = 1. \tag{6.33} \]

For convenience, we also assume that (6.33) holds when a = b, that is, that
$p_{aa} = 1/2$. (It might be more natural to assume that $p_{aa}$ is undefined.) If the
pair comparison system (A, p) satisfies (6.33) for all a, b in A, we call
(A, p) a forced choice pair comparison system.

Forced choice pair comparison systems also arise in group decisionmak-
ing. We assume that each of a group of individuals (experts) is consistent
in his judgments (of preference); that is, his judgments of preference define
a relation. Then we take $p_{ab}$ to be the proportion of individuals who prefer
a to b.
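
The group interpretation can be made concrete with a small computation. The following Python sketch is illustrative only: the function name `pair_comparison_from_rankings` and the four expert rankings are hypothetical, and each expert is assumed to report a strict ranking of the same alternatives. It tabulates the proportion of experts who place a ahead of b and adopts the convention that the diagonal entries equal 1/2.

```python
def pair_comparison_from_rankings(rankings):
    """Return p with p[a][b] = proportion of rankings placing a ahead of b.

    Each ranking lists the same alternatives from most to least preferred.
    Following the convention of the text, p[a][a] is set to 1/2.
    """
    alternatives = sorted(rankings[0])
    n = len(rankings)
    p = {a: {} for a in alternatives}
    for a in alternatives:
        for b in alternatives:
            if a == b:
                p[a][b] = 0.5
            else:
                wins = sum(r.index(a) < r.index(b) for r in rankings)
                p[a][b] = wins / n
    return p


# Hypothetical data: four experts each rank x, y, z.
rankings = [
    ["x", "y", "z"],
    ["x", "z", "y"],
    ["y", "x", "z"],
    ["z", "x", "y"],
]
p = pair_comparison_from_rankings(rankings)
print(p["x"]["y"], p["y"]["x"])  # 0.75 0.25
```

Because every expert ranks strictly, each pair is decided one way or the other by each expert, so Eq. (6.33) holds automatically for the resulting system.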


6.2.2 Probabilistic Utility Models 


6.2.2.1 The Weak Utility Model 


In trying to make precise the notion of probabilistic consistency, we
shall introduce several representations that arise if we try to perform
measurement using data from a forced choice pair comparison system.
When we think of the underlying judgment as one of preference, $p_{ab}$ is in
some sense a measure of strength of preference. The simplest idea is to try
to find a function $f: A \to \mathrm{Re}$ so that for all a, b in A,

\[ p_{ab} > p_{ba} \iff f(a) > f(b). \tag{6.34} \]


If (6.34) holds, then f(a) > f(b) if and only if a is preferred to b a majority 
of times. In a weak sense, f can be taken as a utility function. Following 
Luce and Suppes [1965], we shall say that a forced choice pair comparison 
system (A, p) satisfies the weak utility model if there is a real-valued 
function f on A satisfying Eq. (6.34). The weak utility model is a model of 
probabilistic consistency. 

The representation (6.34) is in a sense derived measurement, for we start
with one scale p and derive a second scale f from p. Note that f is not
defined explicitly in terms of p, but rather satisfies a condition C(p, f), to
use the notation of Section 2.5. A representation theorem for the weak
utility model may be derived from Corollary 2 to Theorem 3.4.* Let us
define a binary relation W on A by

\[ aWb \iff p_{ab} \ge p_{ba}. \tag{6.35} \]


THEOREM 6.8. A forced choice pair comparison system (A, p) satisfies the
weak utility model if and only if (A, W) defined in Eq. (6.35) is a weak order
and (A*, W*) has a countable order-dense subset.
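
When A is finite, the order-density condition holds automatically, so the theorem reduces to checking that W is a weak order. The Python sketch below is an illustration under that finiteness assumption; the function name `weak_utility_scale` and both data sets are hypothetical. It tests transitivity of W (strong completeness follows from forced choice) and, when the test passes, builds f by counting, for each a, the elements b with aWb.

```python
def weak_utility_scale(p):
    """Return f with p[a][b] > p[b][a] <=> f[a] > f[b], or None if none exists.

    p is a dict of dicts for a forced choice pair comparison system on a
    finite set. W is the relation aWb <=> p[a][b] >= p[b][a] of Eq. (6.35).
    """
    A = list(p)
    W = {(a, b) for a in A for b in A if p[a][b] >= p[b][a]}
    # W is strongly complete by construction; it is a weak order iff transitive.
    for a in A:
        for b in A:
            for c in A:
                if (a, b) in W and (b, c) in W and (a, c) not in W:
                    return None
    # In a finite weak order, the number of elements weakly below a is a scale.
    return {a: sum((a, b) in W for b in A) for a in A}


# Hypothetical system in which x beats y beats z by majority.
consistent = {
    "x": {"x": 0.5, "y": 0.6, "z": 0.7},
    "y": {"x": 0.4, "y": 0.5, "z": 0.6},
    "z": {"x": 0.3, "y": 0.4, "z": 0.5},
}
# Hypothetical cyclic majorities: x over y, y over z, z over x.
cyclic = {
    "x": {"x": 0.5, "y": 0.75, "z": 0.25},
    "y": {"x": 0.25, "y": 0.5, "z": 0.75},
    "z": {"x": 0.75, "y": 0.25, "z": 0.5},
}
f_consistent = weak_utility_scale(consistent)
f_cyclic = weak_utility_scale(cyclic)
print(f_consistent)  # {'x': 3, 'y': 2, 'z': 1}
print(f_cyclic)      # None
```

The counting scale is only one admissible choice; by the uniqueness result that follows, any monotone increasing transformation of it serves equally well.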


It is also easy to prove a uniqueness theorem for the weak utility model.
(The reader should refer to the definitions in Section 2.5 before reading
this theorem.) Note that $p_{ab}$ is an absolute scale. Thus, the narrow and
wide senses of uniqueness for f coincide.


THEOREM 6.9. If (A, p) is a forced choice pair comparison system and the
function $f: A \to \mathrm{Re}$ satisfies the weak utility model, then f defines a (regular)
ordinal scale in both the narrow and wide senses.


Proof. It is necessary to prove that if f satisfies (6.34) and $\phi: f(A) \to \mathrm{Re}$
is a monotone increasing transformation, then $\phi \circ f$ satisfies (6.34); and to
prove that if $g: A \to \mathrm{Re}$ also satisfies (6.34), then there is a monotone
increasing $\phi: f(A) \to \mathrm{Re}$ so that $g = \phi \circ f$. These results follow from the
uniqueness theorem for the representation $(A, W) \to (\mathrm{Re}, \ge)$ in Corollary
2 to Theorem 3.4. ∎


*A similar theorem may be derived from Corollary 1 to Theorem 3.4. But we shall use this
version below.
6.2.2.2 The Strong Utility Model


Sometimes it is desirable to obtain a stronger scale than an ordinal scale.
Our next measurement model accomplishes this. We say that a forced
choice pair comparison system (A, p) satisfies the strong utility model if
there is a real-valued function f on A so that for all a, b, c, d in A,

\[ p_{ab} > p_{cd} \iff f(a) - f(b) > f(c) - f(d). \tag{6.36} \]

The idea behind this model is that the probability of making a choice
depends on the difference in scale values. The strong utility model implies
the weak utility model, since

\[
\begin{aligned}
p_{ab} > p_{ba} &\iff f(a) - f(b) > f(b) - f(a) \\
&\iff 2f(a) > 2f(b) \\
&\iff f(a) > f(b).
\end{aligned}
\]

Let us give an example of a pair comparison system that satisfies the weak
utility model but fails to satisfy the strong utility model. Let A = {x, y, z}
and define p by the following matrix, whose a, b entry gives $p_{ab}$:


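\[
(p_{ab}) =
\begin{array}{c|ccc}
  & x & y & z \\ \hline
x & 1/2 & p_{xy} & p_{xz} \\
y & 1 - p_{xy} & 1/2 & 1/2 \\
z & 1 - p_{xz} & 1/2 & 1/2
\end{array}
\tag{6.37}
\]

[The numerical values of $p_{xy}$ and $p_{xz}$ are not legible here; the argument uses only $p_{yz} = p_{zy} = 1/2$ and $p_{xy} \neq p_{xz}$.]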

It is easy to see that the weak utility model holds. However, if f is a
function satisfying Eq. (6.36), $p_{yz} = p_{zy}$ implies that f(y) = f(z). Thus
f(x) - f(y) = f(x) - f(z), which implies $p_{xy} = p_{xz}$, a contradiction.

It is of interest to mention several arguments to the effect that such
failures of the strong utility model can occur for preferences or judgments
of relative loudness. One such argument is based on Davidson and
Marschak [1959, p. 237]. In preferences among different monetary
amounts, we expect $p_{\$5000,\,\$0} = 1$ and $p_{\$1,\,\$0} = 1$, so $p_{\$5000,\,\$0} = p_{\$1,\,\$0}$ and
therefore $f(\$5000) = f(\$1)$. Hence, $p_{\$5000,\,\$2}$ should equal $p_{\$1,\,\$2}$, which is
certainly nonsense. Similar problems occur often when $p_{ab} = 1$.

A second argument suggesting that the strong utility model fails is
closely related to the Davidson-Marschak argument. In psychophysical
experiments, the frequency of judgments of preference, or “louder than,”
and so on is essentially 1 if choices are far enough apart. Thus, for
example, we can have three sounds, with a much louder (more intense)
than b, and b much louder than c, and hence $p_{ab} = p_{bc} = p_{ac} = 1$. If f
satisfies Eq. (6.36), then f(a) = f(b) = f(c). But now suppose d is close in
loudness (intensity) to a, but not to b and c. Then $p_{da} \approx 1/2$ while $p_{db} = 1$,
which is impossible. Thus, the strong utility model fails.

In Section 6.1, we discussed several examples that were arguments
against the transitivity of indifference. Two of the examples, the one about
support of the arts and the one about the pony and the bicycle, can also be
used as arguments against the strong utility model. For, suppose plan a
would allocate a budget of 200 million dollars to a federal Institute for the
Arts, plan b would allocate 200 million dollars to state institutes, and plan
b' would allocate 200 million and one dollars to state institutes. If you are
indifferent between a and b, $p_{ab}$ is 1/2, so f(a) - f(b) is 0. Now similarly $p_{ab'}$
is 1/2, so f(a) - f(b') is 0. However, $p_{b'b}$ is 1, so f(b') - f(b) is large. This is
an impossibility. (This is really an argument against the weak utility
model.) The argument in the pony-bicycle example is similar. (If $p_{ab}$ and
$p_{ab'}$ are only approximately but not exactly 1/2, the argument does not
work.) All these examples involve some probabilities that are 0 or 1. Some
authors have chosen to require that Eq. (6.36) hold only for $p_{ab} \neq 0, 1$.

To give an example that violates the strong utility model and has no
$p_{ab} = 0$ or 1, consider the data of Table 6.5. Suppose there is a function f
satisfying Eq. (6.36). Since

\[ p_{\text{Beethoven, Brahms}} > p_{\text{Beethoven, Wagner}}, \]
we have
\[ f(\text{Beethoven}) - f(\text{Brahms}) > f(\text{Beethoven}) - f(\text{Wagner}), \]
so
\[ f(\text{Wagner}) > f(\text{Brahms}). \]
This implies
\[ f(\text{Mozart}) - f(\text{Brahms}) > f(\text{Mozart}) - f(\text{Wagner}), \]
or
\[ p_{\text{Mozart, Brahms}} > p_{\text{Mozart, Wagner}}, \]
which is false.


Table 6.5. Preferences for Composers*

(The a, b entry shows the proportion of orchestra members interviewed
who preferred composer a over composer b.)

             Beethoven   Brahms   Mozart   Wagner
Beethoven      .50†       .67      .79      .64
Brahms         .33        .50†     .60      .54
Mozart         .21        .40      .50†     .51
Wagner         .36        .46      .49      .50†

*Data from Folgmann [1933].
†By assumption.
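
The argument just given can be mechanized. Under Eq. (6.36), $p_{ab} > p_{ac}$ forces f(c) > f(b), which in turn forces $p_{db} > p_{dc}$ for every d. The Python sketch below searches for quadruples that break this implication; the function name `strong_utility_violations` is hypothetical, and the entries are those of Table 6.5, with the Beethoven-Mozart entry taken as .79 so that forced choice holds against the .21 in the Mozart row. Finding a witness rules out the strong utility model; finding none would not by itself establish the model.

```python
def strong_utility_violations(p):
    """List quadruples (a, b, c, d) with p[a][b] > p[a][c] but p[d][b] <= p[d][c].

    Any such quadruple contradicts Eq. (6.36): the first inequality requires
    f(c) > f(b), and then p[d][b] > p[d][c] would have to hold for every d.
    """
    violations = []
    for a in p:
        for b in p:
            for c in p:
                if p[a][b] > p[a][c]:
                    violations.extend(
                        (a, b, c, d) for d in p if p[d][b] <= p[d][c]
                    )
    return violations


composers = ["Beethoven", "Brahms", "Mozart", "Wagner"]
rows = [
    [0.50, 0.67, 0.79, 0.64],
    [0.33, 0.50, 0.60, 0.54],
    [0.21, 0.40, 0.50, 0.51],
    [0.36, 0.46, 0.49, 0.50],
]
p = {a: dict(zip(composers, row)) for a, row in zip(composers, rows)}

witnesses = strong_utility_violations(p)
# The quadruple used in the text is among the witnesses found.
print(("Beethoven", "Brahms", "Wagner", "Mozart") in witnesses)  # True
```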

As with the weak utility model, sufficient conditions for the strong utility 
model to hold can be obtained from previously known theorems by 
introducing a relation, in this case the quaternary relation D on A defined 
by 


\[ abDcd \iff p_{ab} \ge p_{cd}. \tag{6.38} \]


As in difference measurement, abDcd is interpreted to mean that a is
preferred to b at least as much as c is preferred to d.


THEOREM 6.10. If the quaternary relation (A, D) defined by Eq. (6.38) is 
an algebraic difference structure (Section 3.3.1), then the forced choice pair 
comparison system (A, p) satisfies the strong utility model. Moreover, if 
(A, D) is an algebraic difference structure and the function f: A — Re 
satisfies the strong utility model, then f defines a (regular) interval scale in 
both the narrow and wide senses. 


Proof. Use the representation and uniqueness theorems of Section 3.3. ∎


6.2.2.3 The Fechnerian Utility Model


A variant of the strong utility model is the Fechnerian utility model. A
forced choice pair comparison system (A, p) satisfies this model if there is
a real-valued function f on A and a (strictly) monotone increasing function
$\phi: \mathrm{Re} \to \mathrm{Re}$ so that for all a, b in A,

\[ p_{ab} = \phi[f(a) - f(b)]. \tag{6.39} \]

The idea, as in the strong utility model, is that the greater the difference
between f(a) and f(b), the greater the frequency with which a is preferred
to b. The Fechnerian model is sometimes called the strong utility model,
but we shall distinguish the two. This model appears in classical psycho-
physics, where it arises in the attempt to relate a physical magnitude to a
psychological one. The model implies that if $p_{ab} = p_{cd}$, then f(a) - f(b) =
f(c) - f(d). Thus, pairs of stimuli that are equally often confused are
equally far apart. This property was a critical assumption in Fechner’s
derivation of the psychophysical function as a logarithmic one; see our
discussion of Fechner’s Law in Section 4.1. (For a more detailed discussion
of Fechner’s Law, see Luce and Galanter [1963, Section 2] or Falmagne
[1974].)

Often in variations on this general Fechnerian utility model, the function
$\phi$ is required to be a cumulative distribution function. Then $\phi[f(a) - f(b)]$
is the probability that f(a) is larger than f(b). This general idea goes back
to Thurstone [1927a, b], who sought scale values f(a) so that for all a, b,

\[ p_{ab} = \int_{-\infty}^{f(a) - f(b)} N(x)\,dx, \tag{6.40} \]

where N(x) is the density of a normal distribution with mean 0 and standard
deviation 1.

A more recent variant uses $\phi$ as the logistic distribution $\phi(x) = 1/(1 + e^{-x})$
and seeks scale values f(a) so that

\[ p_{ab} = \frac{1}{1 + e^{-[f(a) - f(b)]}}. \tag{6.41} \]

This model is due to Guilford [1954, p. 144] and Luce [1959]. More
generally, one could seek to assign a random variable F(a) to each a in A
so that

\[ p_{ab} = \Pr[F(a) \ge F(b)]. \tag{6.42} \]

Equation (6.42) is called the random utility model. The interpretation is that
the utilities are no longer assumed to stay fixed, but are determined by
some probabilistic procedure. See Luce and Suppes [1965] or Krantz et al.
[to appear] for a discussion.
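
To see how Eqs. (6.40) and (6.41) turn scale values into choice frequencies, the following Python sketch may help. It is an illustration only: the scale values in `f` are made up, and the helper names `logistic`, `standard_normal_cdf`, and `fechnerian_system` are hypothetical. The normal integral in (6.40) is evaluated through the error function from the standard library.

```python
import math


def logistic(x):
    """Logistic distribution function, phi(x) = 1 / (1 + e^(-x))."""
    return 1.0 / (1.0 + math.exp(-x))


def standard_normal_cdf(x):
    """Integral of the standard normal density from minus infinity to x."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def fechnerian_system(f, phi):
    """Return p with p[a][b] = phi(f[a] - f[b]), as in Eq. (6.39)."""
    return {a: {b: phi(f[a] - f[b]) for b in f} for a in f}


# Hypothetical scale values.
f = {"a": 1.0, "b": 0.0, "c": -0.5}
p_logistic = fechnerian_system(f, logistic)     # Eq. (6.41)
p_thurstone = fechnerian_system(f, standard_normal_cdf)  # Eq. (6.40)

for name, p in (("logistic", p_logistic), ("Thurstone", p_thurstone)):
    print(name, round(p["a"]["b"], 4), round(p["b"]["a"], 4))
```

Both choices of $\phi$ satisfy $\phi(x) + \phi(-x) = 1$, which is what makes the resulting systems forced choice systems with diagonal entries equal to 1/2.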

The Fechnerian utility model clearly implies the strong utility model, for

\[
\begin{aligned}
p_{ab} > p_{cd} &\iff \phi[f(a) - f(b)] > \phi[f(c) - f(d)] \\
&\iff f(a) - f(b) > f(c) - f(d).
\end{aligned}
\]

Thus, the examples we have given which violate the strong utility model
also violate the Fechnerian model.

Finally, let us ask if the strong utility model is equivalent to the
Fechnerian utility model. If A is finite, this is true. For, suppose f satisfies
the strong utility model, that is, Eq. (6.36). Let

\[ B = f(A) - f(A) = \{x - y : x = f(a),\ y = f(b), \text{ for some } a, b \in A\}. \tag{6.43} \]

Define $\phi: B \to \mathrm{Re}$ by $\phi(x - y) = p_{ab}$, where x = f(a) and y = f(b). Then
$\phi$ is well-defined and monotone increasing on B. Moreover, for all a, b in
A, Eq. (6.39) holds. (This argument works even if A is not finite.) If A is
finite, B is finite, and so it is easy to extend $\phi$ to a (strictly) monotone
increasing function $\phi$ on Re, by taking $\phi$ to be a linear function between
two successive elements of B. Thus, $\phi$, f satisfy the Fechnerian utility
model. However, if A is not finite, this argument does not work. The
function $\phi$ may not be extendable to a (strictly) monotone increasing
function on all of Re.*
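
For finite A, the extension step can be sketched in Python as follows. This is a minimal illustration, not a general-purpose routine: the function name `extend_phi` is hypothetical, the scale values and the rule generating p are made up, and integer scale values are used so that equal differences compare equal exactly. Outside the range of B the sketch continues with slope 1, which is one arbitrary way to keep the function strictly increasing on all of Re.

```python
import bisect


def extend_phi(f, p, tol=1e-12):
    """Define phi on B = f(A) - f(A) from p and extend it piecewise linearly."""
    points = {}
    for a in f:
        for b in f:
            x = f[a] - f[b]
            # phi is well defined only if equal differences carry equal p.
            if x in points and abs(points[x] - p[a][b]) > tol:
                raise ValueError("phi is not well defined on B")
            points.setdefault(x, p[a][b])
    xs = sorted(points)
    ys = [points[x] for x in xs]
    if any(y2 <= y1 for y1, y2 in zip(ys, ys[1:])):
        raise ValueError("phi is not strictly increasing on B")

    def phi(x):
        if x <= xs[0]:
            return ys[0] + (x - xs[0])    # slope 1 to the left of B
        if x >= xs[-1]:
            return ys[-1] + (x - xs[-1])  # slope 1 to the right of B
        i = bisect.bisect_right(xs, x) - 1
        t = (x - xs[i]) / (xs[i + 1] - xs[i])
        return ys[i] + t * (ys[i + 1] - ys[i])  # linear between neighbors

    return phi


# Hypothetical strong utility system: p depends only on the scale difference.
f = {"x": 0, "y": 1, "z": 3}
p = {a: {b: 0.5 + (f[a] - f[b]) / 10 for b in f} for a in f}
phi = extend_phi(f, p)
print(phi(1) == p["y"]["x"], phi(0) < phi(0.5) < phi(1))  # True True
```

The extended function takes values outside [0, 1] away from B; this is harmless here, since the model only requires a strictly increasing function from Re to Re that agrees with p on B.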

Some authors (e.g., Pfanzagl [1968, p. 173]) have chosen to define (or
study) the variant of the Fechnerian utility model where ¢ is only required 
to be defined on B = f(A) — f(A). This model is equivalent to the strong 
utility model. Whether or not the version we have stated is equivalent 
seems to be an open question. 


*To see this, let 
a= (:} u[} 0; 
Let 
\[ C = A - A = \{a - b : a, b \in A\}. \]
Then 


c=(-3,-3] u(-4,0] u[o 3) ul}, 3). 
Let $\psi: C \to \mathrm{Re}$ be defined by
x+i if xe(-3,4), 
P(x) =4 x44 if x2 
x+2 if xi- 
Define $p: A \times A \to [0, 1]$ by
\[ p_{ab} = \psi[a - b]. \]
Then we have
\[
\begin{aligned}
p_{ab} > p_{cd} &\iff \psi[a - b] > \psi[c - d] \\
&\iff a - b > c - d.
\end{aligned}
\]
It follows that (A, p) satisfies the strong utility model. Take f(x) = x. Let B = f(A) - f(A)
be as in Eq. (6.43), and define $\phi: B \to \mathrm{Re}$ by $\phi(x - y) = p_{ab}$ if x = f(a) and y = f(b). Then
B = C and on B, $\phi = \psi$. Now $\phi$ cannot be extended to a monotone increasing $\phi$ on all of Re.


For $(§) =2, so (3) would have to be less than 3. However, sup{¢(x): x © B& x <1) 
3 


2, 
This argument does not mean that there are no functions ¢ and f satisfying (6.39), but only 
that this method does not lead to such functions. 
6.2.2.4 The Strict Utility Model


Our last measurement model has simple representation and uniqueness
theorems associated with it. A natural requirement on a utility function f is
that it satisfy

\[ p_{ab} = \frac{f(a)}{f(a) + f(b)}. \tag{6.44} \]


Again following Luce and Suppes [1965], we say that a pair comparison 
system (A, p) satisfies the strict utility model if there is a real-valued 
function f on A satisfying Eq. (6.44). The strict utility model has been 
studied by many authors. The model was used by Zermelo [1929] to 
measure the playing power f(a) of a chess player a. It also has applications 
to measurement of response strength in psychology, as we shall mention 
below. See Luce and Suppes [1965, p. 335] and Krantz et al. [to appear] for
additional references. The pair comparison system of Eq. (6.37) does not
satisfy the strict utility model. If there is a function f on A satisfying Eq.
(6.44), then $p_{yz} = 1/2$ implies

\[ \frac{f(y)}{f(y) + f(z)} = \frac{1}{2}, \]

so f(y) = f(z). Thus,

\[ \frac{f(x)}{f(x) + f(y)} = \frac{f(x)}{f(x) + f(z)}, \]

so $p_{xy} = p_{xz}$, which is a contradiction.

The strict utility model never applies if any $p_{ab}$ is 1. For then f(b) = 0.
But then $p_{bb} = f(b)/[f(b) + f(b)]$ is undefined, instead of 1/2. Requiring (6.44) to
hold only for $a \neq b$ is not a satisfactory solution to this problem. The strict
utility model still rarely applies if any $p_{ab} = 1$. For then f(b) = 0. Given
any other c, $p_{cb} = f(c)/f(c)$, which is either 1 or undefined (if f(c) = 0).
Thus, we shall assume that for all a, b in A, $p_{ab} \neq 0, 1$. If $p_{ab} \neq 0, 1$, then
the “discrimination” between a and b is imperfect. If this is the case for all
a, b, we call the pair comparison system (A, p) imperfect.

Luce [1959] and Luce and Suppes [1965] point out that if (A, p) is
imperfect, the strict utility model implies the Fechnerian and strong utility
models. For suppose f satisfies (6.44). If f(a) < 0, any a, then f(b) < 0, all
b. Otherwise

\[ \frac{f(b)}{f(b) + f(a)} \]

is either negative or greater than 1, and both contradict $p_{ba} \in (0, 1)$. Let
g(a) = -f(a). Then g(a) > 0, all a, and g also satisfies the strict utility
model. Thus, in any case, we may assume that $f(a) \ge 0$, all a. But f(a) = 0
implies that $p_{aa}$ is undefined instead of 1/2. Hence, we have f(a) > 0, all a.
We can now define f' = ln f. Let $\phi$ be the logistic distribution function
$\phi(\lambda) = 1/(1 + e^{-\lambda})$. Then

\[
\begin{aligned}
p_{ab} &= \frac{f(a)}{f(a) + f(b)} \\
&= \frac{1}{1 + f(b)/f(a)} \\
&= \frac{1}{1 + \exp\{-[f'(a) - f'(b)]\}} \\
&= \phi[f'(a) - f'(b)],
\end{aligned}
\]

so f' and $\phi$ satisfy the Fechnerian utility model. The strong utility model
follows.

It is not hard to show by example that the Fechnerian utility model can 
hold for (A, p) while the strict utility model fails. 

We now present a representation theorem for the strict utility model. We 
say that an imperfect forced choice pair comparison system (A, p) satisfies 
the product rule if for all a, b, c in A, 


\[ p_{ab}\,p_{bc}\,p_{ca} = p_{ac}\,p_{cb}\,p_{ba}. \tag{6.45} \]


The following theorem is proved in Suppes and Zinnes [1963], and we 
follow their proof. (The results also appear in Luce [1959], though not in 
one place.) 


THEOREM 6.11. An imperfect forced choice pair comparison system (A, p)
satisfies the strict utility model if and only if it satisfies the product rule.


Proof. Suppose the product rule holds. Let e be any element of A. Then
define f(a) by

\[ f(a) = p_{ae} / p_{ea}. \]

(Since (A, p) is imperfect, we may divide by $p_{ea}$.) We show that f satisfies
(6.44). For, given a and b, we have

\[
\begin{aligned}
\frac{f(a)}{f(a) + f(b)} &= \frac{p_{ae}/p_{ea}}{p_{ae}/p_{ea} + p_{be}/p_{eb}} \\
&= \frac{p_{ae}/p_{ea}}{p_{ae}/p_{ea} + (p_{ba}/p_{ab})(p_{ae}/p_{ea})} \quad \text{(by the product rule)} \\
&= \frac{1}{1 + (p_{ba}/p_{ab})} \\
&= \frac{p_{ab}}{p_{ab} + p_{ba}} \\
&= p_{ab} \quad \text{(by forced choice)}.
\end{aligned}
\]

The proof of the converse is left to the reader. ∎
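
The construction in the proof is easy to try numerically. The Python sketch below is illustrative: the names `satisfies_product_rule` and `strict_utility_scale` and both data sets are hypothetical, and exact rational arithmetic is assumed so that Eq. (6.45) can be tested with equality rather than a tolerance. It checks the product rule and then builds $f(a) = p_{ae}/p_{ea}$ for a chosen reference element e.

```python
from fractions import Fraction
from itertools import product


def satisfies_product_rule(p):
    """Check Eq. (6.45) for every triple a, b, c."""
    return all(
        p[a][b] * p[b][c] * p[c][a] == p[a][c] * p[c][b] * p[b][a]
        for a, b, c in product(p, repeat=3)
    )


def strict_utility_scale(p, e):
    """Return f(a) = p_ae / p_ea, the scale used in the proof of Theorem 6.11."""
    return {a: p[a][e] / p[e][a] for a in p}


# Hypothetical system generated from weights 1, 2, 3, so the rule holds.
g = {"x": Fraction(1), "y": Fraction(2), "z": Fraction(3)}
p = {a: {b: g[a] / (g[a] + g[b]) for b in g} for a in g}
f_x = strict_utility_scale(p, "x")
f_y = strict_utility_scale(p, "y")

# Hypothetical system with p_xy = p_yz = p_xz = 3/4, which breaks the rule.
h = Fraction(1, 2)
q = {
    "x": {"x": h, "y": Fraction(3, 4), "z": Fraction(3, 4)},
    "y": {"x": Fraction(1, 4), "y": h, "z": Fraction(3, 4)},
    "z": {"x": Fraction(1, 4), "y": Fraction(1, 4), "z": h},
}
print(satisfies_product_rule(p), satisfies_product_rule(q))  # True False
```

Changing the reference element from x to y rescales f by a constant factor (here 1/2), in line with the ratio-scale uniqueness stated in Theorem 6.12.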


Remark: In psychology, imperfect forced choice pair comparison sys- 
tems satisfying the product rule are sometimes called Bradley—Terry—Luce 
(or BTL) systems, after Bradley and Terry [1952], Bradley [1954a,b, 1955], 
and Luce [1959]. (Cf. Suppes and Zinnes [1963].) Here, $p_{ab}$ is interpreted as
the proportion of times a subject judges stimulus a to be greater in some
sense than stimulus b, and the function f is interpreted as a measure of
response strength. 


We next state a uniqueness theorem for the strict utility model, again 
following Suppes and Zinnes [1963]. If a similarity transformation may be 
either positive or negative (but not zero), let us refer to a generalized ratio 
scale rather than a ratio scale. 


THEOREM 6.12. If (A, p) is an imperfect forced choice pair comparison 
system and the function f: A — Re satisfies the strict utility model, then f 
defines a (regular) generalized ratio scale in both the narrow and wide senses. 
If f is required to be positive, then f defines a (regular) ratio scale in both the 
narrow and wide senses. 


Proof. Note that (A, p) is an absolute scale, so the narrow and wide
senses of uniqueness for derived scales coincide. Clearly, if $\alpha \neq 0$ and
$f' = \alpha f$, then f' also satisfies the strict utility model. Suppose next that f' is
any other function satisfying this model. Fix e in A. Note that $f(e) \neq 0$,
$f'(e) \neq 0$. This follows, since $p_{ee} = 1/2$. Let a be any element in A. Then,
since $f(e) \neq 0$ and $f'(e) \neq 0$,

\[ \frac{f(a)}{f(e)} = \frac{p_{ae}}{p_{ea}} = \frac{f'(a)}{f'(e)}. \]

It follows that

\[ f'(a) = \alpha f(a), \]

where $\alpha$ is f'(e)/f(e), which is not zero. If f and f' are required to be
positive, then $\alpha$ is positive. ∎


The four measurement models presented in Section 6.2.2 are all models 
of probabilistic consistency. They are successively more restrictive, in the
sense that the strict utility model implies the Fechnerian utility model, 
which implies the strong utility model, which implies the weak utility 
model. Moreover, the weak utility model does not imply the strong 
utility model, and the Fechnerian utility model does not imply the strict 
utility model. As we have remarked, however, it is not known if the 
strong utility model implies the Fechnerian utility model. 

In the next section, we shall discuss some related conditions and also 
several additional situations in which these various models fail or are 
satisfied. 


6.2.3 Transitivity Conditions 


The product rule of Eq. (6.45) is a useful condition in that it can be 
subjected to test given the data about frequency of preference. Luce and 
Suppes [1965] call such a condition an observable property. A nonobserv- 
able property is a condition like Eq. (6.44), which is stated in terms of 
unknown functions. In this section, we shall state three observable condi- 
tions that are closely related to the utility models of the previous section. 
Each of these can also be thought of as a model or condition of probabilis- 
tic consistency. 

We say that a forced choice pair comparison system (A, p) satisfies 
Weak Stochastic Transitivity (WST) if for all a, b, c in A, 


\[ p_{ab} \ge 1/2 \ \&\ p_{bc} \ge 1/2 \implies p_{ac} \ge 1/2. \tag{6.46} \]


In words, whenever at least 1/2 the time a is preferred to b (judged louder
than b), and at least 1/2 the time b is preferred to c (judged louder than c),
then at least 1/2 the time, a is preferred to c (judged louder than c). This is a
probabilistic kind of transitivity and might be more appropriately called 
weak probabilistic transitivity—the word stochastic is used for probabilis- 
tic processes that take place over time, and there is no notion of time 
involved here. 

(A, p) is said to satisfy Moderate Stochastic Transitivity (MST) if for all
a, b, c in A,

\[ p_{ab} \ge 1/2 \ \&\ p_{bc} \ge 1/2 \implies p_{ac} \ge \min\{p_{ab}, p_{bc}\}. \tag{6.47} \]

This condition is a stronger form of transitivity. It says that if a is
preferred to b at least 1/2 the time and b is preferred to c at least 1/2 the time,
then a is preferred to c at least as large a proportion of times as the
minimum of $p_{ab}$ and $p_{bc}$. Naturally, if (A, p) satisfies MST, it must satisfy
WST. The converse is false, as is easy to show by example.

(A, p) is said to satisfy Strong Stochastic Transitivity (SST) if for all a, b,
c in A,

\[ p_{ab} \ge 1/2 \ \&\ p_{bc} \ge 1/2 \implies p_{ac} \ge \max\{p_{ab}, p_{bc}\}. \]

Of course, SST implies MST. The converse is false. A survey of many 
other conditions of stochastic transitivity can be found in Fishburn [1973]. 
All three of the observable stochastic transitivity conditions, WST, MST, 
and SST, follow from the strong utility model (and hence from the 
Fechnerian and strict utility models). For we have the following result. 


THEOREM 6.13. For forced choice pair comparison systems, the strong 
utility model implies SST. 


Proof. Suppose $p_{ab} \ge 1/2$ and $p_{bc} \ge 1/2$. By forced choice, $p_{cc} = 1/2$, so
$p_{ab} \ge p_{cc}$. Thus

\[ f(a) - f(b) \ge f(c) - f(c), \]
\[ f(a) - f(c) \ge f(b) - f(c), \]
\[ p_{ac} \ge p_{bc}. \]

Similarly, $p_{bc} \ge p_{aa}$ implies $p_{ac} \ge p_{ab}$. ∎
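
Since WST, MST, and SST are observable properties, they can be tested directly on data. The Python sketch below is an illustration: the function name `stochastic_transitivity` is hypothetical, and the data are the entries of Table 6.5, with the Beethoven-Mozart entry taken as .79 so that forced choice holds against the .21 in the Mozart row. For every triple whose two hypotheses hold, it compares $p_{ac}$ with 1/2, with the minimum, and with the maximum of $p_{ab}$ and $p_{bc}$.

```python
from itertools import product


def stochastic_transitivity(p):
    """Report whether p satisfies WST, MST and SST."""
    result = {"WST": True, "MST": True, "SST": True}
    for a, b, c in product(p, repeat=3):
        if p[a][b] >= 0.5 and p[b][c] >= 0.5:
            if p[a][c] < 0.5:
                result["WST"] = False
            if p[a][c] < min(p[a][b], p[b][c]):
                result["MST"] = False
            if p[a][c] < max(p[a][b], p[b][c]):
                result["SST"] = False
    return result


composers = ["Beethoven", "Brahms", "Mozart", "Wagner"]
rows = [
    [0.50, 0.67, 0.79, 0.64],
    [0.33, 0.50, 0.60, 0.54],
    [0.21, 0.40, 0.50, 0.51],
    [0.36, 0.46, 0.49, 0.50],
]
p = {a: dict(zip(composers, row)) for a, row in zip(composers, rows)}

report = stochastic_transitivity(p)
print(report)  # {'WST': True, 'MST': True, 'SST': False}
```

SST fails on the triple Beethoven, Brahms, Wagner, where .64 is less than .67. By Theorem 6.13 this agrees with the earlier finding that these data violate the strong utility model.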


The converse of Theorem 6.13 is false. Take A = {x, y, u, v} and define
p as follows:

\[
(p_{ab}) =
\begin{array}{c|cccc}
  & x & y & u & v \\ \hline
x & 1/2 & 1 & 1 & 1 \\
y & 0 & 1/2 & 1 & 1 \\
u & 0 & 0 & 1/2 & 1/2 \\
v & 0 & 0 & 1/2 & 1/2
\end{array}
\]

Then SST is satisfied, but the strong utility model fails. To see the latter,
note that $p_{xu} = p_{yu}$, so f(x) = f(y), and $p_{xy} = p_{xu}$, so f(y) = f(u). It
follows that f(x) = f(u) and that $p_{xu} = p_{uu}$, which is false.

In the next subsection, we shall show that if A is finite, then WST is 
equivalent to the weak utility model. Thus, we can sum up as follows: 


COROLLARY. If A is finite, then we have the following string of implications 
and equivalences: strict utility model $\Rightarrow$ Fechnerian utility model $\Rightarrow$ 
strong utility model $\Rightarrow$ SST $\Rightarrow$ MST $\Rightarrow$ WST $\Leftrightarrow$ weak utility model. 


In this Corollary, none of the implications are equivalences, with the 
possible exception of that from the Fechnerian utility model to the strong 
utility model. We do not know whether that is an equivalence. If A is not 
finite, then it is easy to show that WST does not imply the weak utility 
model. Indeed, even SST does not imply the weak utility model. See Exer. 
6 for an example. 

There have been many arguments (and experimental results) questioning 
conditions like SST, and hence questioning some of the utility models. We 
have mentioned some of the arguments against the utility models in the 
previous section. Here, let us mention arguments against the transitivity 
conditions. Suppose you are given a choice between a trip to Paris and a 
trip to Rome, and you have no clear preference. In fact, your frequency of 
choosing Paris over Rome is about 1/2, say just a bit more than 1/2. Thus, 

$$p_{\text{Paris},\,\text{Rome}} \ge \tfrac{1}{2}.$$


A travel agent offers you a “package tour” of Paris, with one dollar of 
spending money. Certainly you prefer this to Paris alone. Thus, 


$$p_{\text{Paris}+\$1,\,\text{Paris}} = 1.$$


By SST, 


$$p_{\text{Paris}+\$1,\,\text{Rome}} = 1.$$


That is, you always prefer Paris + $1 to Rome. On the other hand, it seems 
likely that you would still be pretty much undecided between Paris + $1 
and Rome, and so 


$$p_{\text{Paris}+\$1,\,\text{Rome}} \approx \tfrac{1}{2}.$$


(This is very similar to the example involving support of the arts and to the 
pony-bicycle example.) 

Experimental evidence showing SST to be violated has been given by 
Tversky [1969] and Tversky and Russo [1969]. Tversky and Russo presented 
subjects with a pair of rectangles and asked them to judge which 
rectangle was larger. There were two kinds of rectangles, short fat ones and 
long thin ones. Subjects tend to be good at comparing areas of figures 
similar in shape, and not so good for figures of different shapes. Suppose a 
and b are short fat rectangles, close in area, with a somewhat larger. 
Suppose c is long and thin, close in area to b. In experiments, $p_{ab}$, the 
frequency with which the area of a is judged to be larger than the area of 
b, tends to be almost 1, while $p_{ac}$ and $p_{bc}$ tend to be approximately 1/2. Thus, 
SST is violated. 

In an experiment performed by Tversky [1969], subjects were given 
choices among alternatives that basically had three dimensions (cf. 
Chapter 5). The instructions suggested that if two alternatives were close (a 
term not precisely defined) on the first component, then choice should be 
made on the basis of the second and third components. Otherwise, choices 
should be made on the basis of the first component. A typical experimental 
situation involved hypothetical deans of admission, who were asked to 
judge candidates first on the basis of intellectual ability and then, if they 
were close on this basis, on the basis of emotional stability and social 
facility. In such a situation, suppose a and b are close on the first 
component, with a substantially better than b on the second and third 
components. Suppose b and c are also close on the first component, with b 
substantially better on the second and third components. Finally, suppose 
c is substantially enough better than a on the first component. Then people 
tend to choose a over b a majority of times, b over c a majority of times, 
and c over a a majority of times. Here, even WST is violated. To give a 
numerical example, suppose a is the triple (2, 6, 8), b the triple (3, 4, 4), c 
the triple (4, 2, 2), and “close” means at most 1 apart. Then a is rated over 
b most of the time, and similarly b over c, and c over a.* 

Another violation of SST was obtained in a less contrived experiment 
performed by Coombs [1959]. The stimuli in this experiment were 12 gray 
chips, labeled A, B, C, ..., L, and subjects were asked to compare chips as 
to which was a most representative gray.† The data is shown in Table 6.6. 
Note that $p_{GD} = .86$ and $p_{DE} = .51$, but $p_{GE} = .79$, violating SST. Also, 
$p_{IJ} = .98$, $p_{JC} = .57$, and $p_{IC} = .80$. There are many other violations of 
SST. 
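
The two violations quoted from Table 6.6 can be verified directly. The sketch below is illustrative only; the helper name `sst_holds` is hypothetical, and the numbers are the three entries per triple cited in the paragraph above.

```python
def sst_holds(p_ab, p_bc, p_ac):
    """SST for one ordered triple (a, b, c); vacuously true if a hypothesis fails."""
    if p_ab >= 0.5 and p_bc >= 0.5:
        return p_ac >= max(p_ab, p_bc)
    return True

# (a, b, c) -> (p_ab, p_bc, p_ac), as quoted in the text from Table 6.6.
triples = {
    ("G", "D", "E"): (0.86, 0.51, 0.79),
    ("I", "J", "C"): (0.98, 0.57, 0.80),
}

for names, probs in triples.items():
    print(names, "satisfies SST:", sst_holds(*probs))
```

In both triples $p_{ac}$ still exceeds the smaller of $p_{ab}$ and $p_{bc}$, so these particular entries violate SST without violating MST.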


*The Paris-Rome example, the rectangle example, and the college entrance example all 
are multidimensional in nature. In the first, there is a city dimension and a dollar dimension; 
in the second, a shape and a size dimension; and in the third, three dimensions. The 
transitivity conditions and the utility models we have been studying are most appropriate for 
“one-dimensional” sets of stimuli. However, the next example shows that even with data that 
is seemingly one-dimensional, there can be violations. In any case, the examples and 
experimental results we have mentioned suggest that a multidimensional theory of probabilis- 
tic consistency would be important to develop. 

†Actually, four chips at a time were presented, and rank-ordered as to representativeness, 
and then pair comparisons were derived from these rank orderings. 

Table 6.6. Preferences for Chips* 

(The i, j entry is the relative frequency $p_{ij}$ that i was judged more representative gray 
than j, by a single subject. In this and the following tables, the missing entries may 
be obtained from the forced choice assumption.) 

       G     F     H     I     D     E     J     C     B     K     L     A
G           .53   .63   .66   .86   .79   .97   .92   .94   .97   .99   .99
F                 .62   .62   .87   .87   .90   .9?   .99   .93   .99  1.00
H                       .54   .63   .68   .94   .80   .89   .97   .99   .97
I                             .64   .61   .98   .80   .82  1.00  1.00   .99
D                                   .51   .6?   .93   .99   .87   .96   .99
E                                         .71   .94   .99   .84   .96   .99
J                                               .57   .63   .94  1.00   .91
C                                                     .93   .68   .93  1.00
B                                                           .58   .84  1.00
K                                                                1.00   .73
L                                                                       .53
A

*Data from Coombs [1959, p. 229, Table 4]. 


However, in an experiment by Davidson and Marschak [1959], fifteen of 
seventeen subjects seemed to be satisfying SST, at least in a statistical 
sense. And Griswold and Luce [1962] and McLaughlin and Luce [1965] 
also found support for SST. 

In spite of contrary examples, the stochastic transitivity conditions are 
still widely accepted. Indeed, most experimental data seems to confirm at 
least WST and, usually, MST as well. See Luce and Suppes [1965] for a 
discussion of such data. In the next subsection, we study the stochastic 
transitivity conditions in more detail. 


6.2.4 Homogeneous Families of Semiorders 


We have already seen that some of our results about relations can be 
translated into results about pair comparison systems. For example, if the 
binary relation W is defined on A by 


$$aWb \Leftrightarrow p_{ab} \ge p_{ba}, \tag{6.35}$$


then Theorem 6.8 gives a representation theorem for the weak utility model 
in terms of (A, W). In this subsection, we shall prove similar theorems. 


THEOREM 6.14. If A is finite and (A, p) is a forced choice pair comparison 
system, then the following statements are equivalent: 

(a) (A, p) satisfies the weak utility model. 

(b) (A, p) satisfies WST. 

(c) If W is defined from p by Eq. (6.35), then (A, W) is a weak order. 




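Proof. We have already observed in Theorem 6.8 that (a) and (c) are 
equivalent. Now (a) implies (b), for if $p_{ab} \ge \tfrac{1}{2}$ and $p_{bc} \ge \tfrac{1}{2}$, then $f(a) \ge f(b)$ 
and $f(b) \ge f(c)$, so $f(a) \ge f(c)$, so $p_{ac} \ge \tfrac{1}{2}$. Finally, we show that 
(b) implies (c). To show that (A, W) is weak, note that strong completeness 
follows, since either $p_{ab} \ge p_{ba}$ or $p_{ba} \ge p_{ab}$. Transitivity follows, since 

$$aWbWc \;\Rightarrow\; p_{ab} \ge p_{ba} \;\&\; p_{bc} \ge p_{cb}$$
$$\Rightarrow\; p_{ab} \ge \tfrac{1}{2} \;\&\; p_{bc} \ge \tfrac{1}{2}$$
$$\Rightarrow\; p_{ac} \ge \tfrac{1}{2}$$
$$\Rightarrow\; p_{ac} \ge p_{ca}$$
$$\Rightarrow\; aWc.$$
∎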


As we have remarked earlier, this theorem is false without the hypothesis 
of finiteness. See Exer. 6. 
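
For finite A, the step from WST to the weak utility model can be made concrete by counting. The sketch below assumes that Eq. (6.34), which is not reproduced in this section, has the form $p_{ab} > p_{ba} \Leftrightarrow f(a) > f(b)$. The choice probabilities are hypothetical, and the counting scale (f(a) = number of b with aWb) is one standard way to represent a finite weak order, not necessarily the construction behind Theorem 6.8.

```python
def forced_choice(items, upper):
    """Build a forced choice system: p[a][a] = 1/2 and p[b][a] = 1 - p[a][b]."""
    p = {a: {b: 0.5 for b in items} for a in items}
    for (a, b), value in upper.items():
        p[a][b] = value
        p[b][a] = 1 - value
    return p

def counting_scale(p, items):
    """f(a) = number of b with a W b, where a W b iff p[a][b] >= p[b][a]."""
    return {a: sum(1 for b in items if p[a][b] >= p[b][a]) for a in items}

def is_weak_utility(p, items, f):
    """Assumed form of the weak utility model: p_ab > p_ba iff f(a) > f(b)."""
    return all((p[a][b] > p[b][a]) == (f[a] > f[b])
               for a in items for b in items)

items = ["x", "y", "z"]

# Hypothetical data satisfying WST (though not MST).
p = forced_choice(items, {("x", "y"): 0.6, ("y", "z"): 0.7, ("x", "z"): 0.55})
f = counting_scale(p, items)
print(f, is_weak_utility(p, items, f))

# Hypothetical cyclic data violating WST: the counting scale no longer works.
cycle = forced_choice(items, {("x", "y"): 0.6, ("y", "z"): 0.6, ("z", "x"): 0.6})
print(is_weak_utility(cycle, items, counting_scale(cycle, items)))
```

The counting scale works exactly when W is a weak order, which is why finiteness of A matters: on an infinite set the counts need not be finite.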

Often in experiments with loudness judgments, a is taken to be louder 
than b (or beyond threshold) if it is judged louder a sufficiently large 
percentage of the time, for example 2 of the time. A similar idea is useful 
for preference. Corresponding to any number $\lambda \in [\tfrac{1}{2}, 1)$, we can define a 
binary relation $R_\lambda$ on A as follows: 

$$aR_\lambda b \Leftrightarrow p_{ab} > \lambda.$$


Thus, $aR_\lambda b$ if and only if a is preferred to b a fraction of the time greater 
than $\lambda$. The relation $R_\lambda$ was introduced by Luce [1958, 1959], who proved 
that under certain conditions each $(A, R_\lambda)$, $\lambda \in [\tfrac{1}{2}, 1)$, is a semiorder. It is 
easy to prove that this conclusion follows from SST (strong stochastic 
transitivity). The proof uses the results of Section 6.1.5, specifically Theorem 
6.5, which says that since $(A, R_\lambda)$ is asymmetric, it is a semiorder if it 
is compatible with some weak order. The candidate for the compatible 
weak order is the relation W of Eq. (6.35). W is a weak order, since SST 
implies WST, which by Theorem 6.14 implies that W is weak. To show 
compatibility, we must show that 


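$$aR_\lambda b \Rightarrow aWb \tag{6.49}$$

and 

$$aWbWc \;\&\; aI_\lambda c \;\Rightarrow\; [aI_\lambda b \text{ and } bI_\lambda c], \tag{6.50}$$

where 

$$aI_\lambda b \Leftrightarrow \;\sim aR_\lambda b \;\&\; \sim bR_\lambda a. \tag{6.51}$$

Equation (6.49) follows, since from $\lambda \ge \tfrac{1}{2}$, we have 

$$aR_\lambda b \;\Rightarrow\; p_{ab} > \lambda \;\Rightarrow\; p_{ab} \ge p_{ba} \;\Rightarrow\; aWb.$$

To demonstrate (6.50), note that 

$$xI_\lambda y \Leftrightarrow [p_{xy} \le \lambda \text{ and } p_{yx} \le \lambda].$$

Suppose aWbWc and $aI_\lambda c$. Then $p_{ab} \ge p_{ba}$, $p_{bc} \ge p_{cb}$, and $p_{ac} \le \lambda$. By SST, 
$p_{ac} \ge p_{ab}$ and $p_{ac} \ge p_{bc}$. Thus, $p_{ab} \le \lambda$ and $p_{bc} \le \lambda$. Since $p_{ba} \le p_{ab}$ and 
$p_{cb} \le p_{bc}$, $aI_\lambda b$ and $bI_\lambda c$ follow. This completes the proof of Luce’s result. 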
The family of semiorders 

$$\mathcal{F} = \{(A, R_\lambda): \lambda \in [\tfrac{1}{2}, 1)\}$$

is special because the same weak order is compatible with each $(A, R_\lambda)$. 
Such a family of semiorders is called homogeneous. Thus, SST implies that 
$\mathcal{F}$ is a homogeneous family of semiorders. The converse is also true, under 
a special assumption: for all $a \ne b$ in A, $p_{ab} \ne \tfrac{1}{2}$. A forced choice pair 
comparison system satisfying this assumption will be called discriminated. 
Discrimination is not a very special assumption, since if a large number of 
choices are presented between a and b, it is unlikely that exactly half the 
time a will be chosen over b. This assumption can be replaced by the even 
weaker assumption, 

$$p_{ab} = \tfrac{1}{2} \;\Rightarrow\; (\forall c)(p_{ac} = p_{bc}).$$

THEOREM 6.15 (Roberts [1971a]). Suppose (A, p) is a discriminated forced 
choice pair comparison system. Then (A, p) satisfies SST if and only if 

$$\mathcal{F} = \{(A, R_\lambda): \lambda \in [\tfrac{1}{2}, 1)\}$$

is a homogeneous family of semiorders. 


Proof. We have proved one direction. The proof of the other direction 
will be omitted. 


The condition that $\mathcal{F}$ is a homogeneous family of semiorders is a testable 
condition if A is finite. For in this case, $\mathcal{F}$ has only finitely many different 
relations $(A, R_\lambda)$. 
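
The finiteness remark suggests a direct test. The sketch below is one possible implementation under stated assumptions, not a procedure given in the text. It takes the semiorder axioms in their usual four-point form (the same form that motivates the conditions of Exer. 20) and checks compatibility against the particular weak order W of Eq. (6.35) via (6.49) and (6.50). All function names and the first data set are hypothetical; the second data set uses the three Table 6.6 entries for G, D, E cited earlier.

```python
from itertools import product
from math import exp

def r_lambda(p, items, lam):
    """R_lambda as a set of ordered pairs: a R b iff p[a][b] > lam."""
    return {(a, b) for a in items for b in items if p[a][b] > lam}

def is_semiorder(R, items):
    """Irreflexivity plus the two four-point semiorder conditions (assumed form)."""
    if any((a, a) in R for a in items):
        return False
    for a, b, c, d in product(items, repeat=4):
        if (a, b) in R and (c, d) in R and not ((a, d) in R or (c, b) in R):
            return False
        if (a, b) in R and (b, c) in R and not ((a, d) in R or (d, c) in R):
            return False
    return True

def compatible_with_w(p, items, lam):
    """Conditions (6.49) and (6.50), with W taken from Eq. (6.35)."""
    R = r_lambda(p, items, lam)

    def W(a, b):
        return p[a][b] >= p[b][a]

    def I(a, b):
        return (a, b) not in R and (b, a) not in R

    if any(not W(a, b) for a, b in R):
        return False
    return all(I(a, b) and I(b, c)
               for a, b, c in product(items, repeat=3)
               if W(a, b) and W(b, c) and I(a, c))

def w_is_transitive(p, items):
    """W is automatically complete under forced choice; check transitivity."""
    return all(p[a][c] >= p[c][a]
               for a, b, c in product(items, repeat=3)
               if p[a][b] >= p[b][a] and p[b][c] >= p[c][b])

def homogeneous_family(p, items):
    # R_lambda only changes when lambda passes an observed value in [1/2, 1),
    # so finitely many thresholds cover the whole family.
    cuts = sorted({0.5} | {p[a][b] for a in items for b in items
                           if 0.5 <= p[a][b] < 1})
    return w_is_transitive(p, items) and all(
        is_semiorder(r_lambda(p, items, lam), items)
        and compatible_with_w(p, items, lam)
        for lam in cuts)

# Hypothetical discriminated system built from scale values (satisfies SST).
f = {"a": 2.0, "b": 1.0, "c": 0.0}
abc = list(f)
p_sst = {x: {y: 1 / (1 + exp(-(f[x] - f[y]))) for y in abc} for x in abc}

# Entries for chips G, D, E quoted from Table 6.6 (they violate SST).
gde = ["G", "D", "E"]
p_gde = {x: {y: 0.5 for y in gde} for x in gde}
for (x, y), v in {("G", "D"): 0.86, ("D", "E"): 0.51, ("G", "E"): 0.79}.items():
    p_gde[x][y], p_gde[y][x] = v, 1 - v

print(homogeneous_family(p_sst, abc), homogeneous_family(p_gde, gde))
```

Testing against W alone is a design shortcut: it confirms homogeneity when it succeeds, and for discriminated systems Theorem 6.15 ties the outcome to SST.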

The condition that $(A, R_\lambda)$ is a semiorder, or that $\mathcal{F}$ is a family of 
semiorders, is again a condition of probabilistic consistency, stated in 
terms of relations. For other related results about probabilistic consistency, 
see Block and Marschak [1959], Luce and Suppes [1965], Tversky and 
Russo [1969], Roberts [1971a], and Fishburn [1973]. 


6.2.5 Beyond Binary Choices 


The probabilistic consistency models presented here, and the measure- 
ment models presented throughout this book, are special in the following 
sense: they restrict the basic data to “binary” choices. However, more can 
be learned about an individual’s judgments by presenting him with a set B 
of more than two alternatives, and asking him to select that element a from 
B that is most preferred, loudest, etc. Suppose p(a, B) represents the 
frequency with which a is chosen when B is presented. Then $p_{ab}$ = 
p(a, {a, b}). A number of probabilistic consistency models starting with 
the basic data p(a, B), rather than the basic data $p_{ab}$, have been developed. 
For a summary, the reader is referred to Luce and Suppes [1965] or Krantz 
et al. [to appear]. 


Exercises 


1. (a) Show that the data of Table 6.5 violates SST. 
(b) What about MST? 
(c) What about WST? 


2. Does the data of Table 6.6 satisfy the weak utility model? If so, find 
a function f satisfying Eq. (6.34). 

3. Suppose an individual is considering the alternative (a) of taking a 
new job at a salary of $30,000, as opposed to the alternative (b) of keeping 
his present job at his present salary, and suppose $p_{ab} = \tfrac{1}{2}$ (he is indifferent). 
Consider the new alternative a' of taking the new job at a salary of 
$30,001. Argue that SST is violated. 


4. If the scale values f(a) satisfying the strict utility model represent 
response strength, show that the following statements made in terms of 
these values are meaningful provided we allow only positive scale values, 
but not if we allow negative ones: 

(a) a responds more strongly than b. 

(b) a responds twice as strongly as b. 

(c) a and b both respond more strongly than c. 

(d) The response strength of a is greater than the sum of the 
response strengths of b and c. 


5. (a) Suppose A = {x, y, z, w} and let $p_{ab}$ be defined by 


x sy w 

2 ieee es a 
(pp)=¥ |% 2% F 
BU gg 

WAG Pas 
Show that (A, p) satisfies the weak utility model, but not the strong utility 
model or SST. 

(b) Suppose A = {x, y, u, v, w} is a set of sounds and $p_{ab}$ as defined 
in Exer. 6, Section 3.1, gives the proportion of times a subject says a is 
louder than b. Which of the following are satisfied? 

(i) WST. 
(ii) MST. 
(iii) SST. 
(iv) Weak Utility Model. 
(v) Strong Utility Model. 
(vi) Fechnerian Utility Model. 
(vii) Strict Utility Model. 
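6. (Luce and Suppes [1965]) Suppose A = Re × Re, let $\alpha \ge \beta \ge \tfrac{1}{2}$, 
and define 

$$p_{ab} = \begin{cases} \alpha & \text{if } a_1 > b_1, \\ \beta & \text{if } a_1 = b_1 \text{ and } a_2 > b_2, \\ \tfrac{1}{2} & \text{if } a_1 = b_1 \text{ and } a_2 = b_2, \\ 1 - p_{ba} & \text{otherwise.} \end{cases}$$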


(a) Show that (A, p) satisfies SST (and hence MST and WST). 
(b) Show that (A, p) does not satisfy the weak utility model. 


7. (Block and Marschak [1959]) Suppose A = {1, 2, 3, 4, 5} and 
3 < Par < Ps4 < P32 < Das < Ps3 < P31 < Par < Par < Ps2 < Psi 


Show that the strong utility model fails. 


8. Give examples to show that 
(a) The strict utility model does not imply the Fechnerian utility 
model. 
(b) WST does not imply MST. 
(c) MST does not imply SST. 


9. One of the difficulties with assessing the various conditions of 
probabilistic consistency is how to decide if they are satisfied or violated. 
For example, suppose $p_{ab} = .7$, $p_{bc} = .6$, and $p_{ac} = .68$. Is this really a 
violation of SST, or is it a statistically insignificant aberration, which arises 
only because we are using observed proportions to estimate probabilities? 
A possible statistical test of the strict utility model is based on the 
observation that if the strict utility model holds, then for all a and b in A, 

$$f(a)/f(b) = p_{ab}/p_{ba}.$$

Thus, suppose the elements of A are ordered as $a_1, a_2, \ldots, a_n$, and 
suppose $f(a_n)$ is set at 1. Then $f(a_{n-1})$ can be estimated as $p_{a_{n-1}a_n}/p_{a_n a_{n-1}}$. 

Table 6.7. Entry a, b Is the Proportion of Subjects Who Preferred Spending an 
Hour with a to Spending an Hour with b* 


LJ HW CD JU CY AF BB ET SL 


Lyndon Johnson 68 70 75 78 16 6.74 «668 ~ «61 
Harold Wilson 59 70) 740 6867 52s. 
Charles DeGaulle 62 67 59 60 52 51 
Johnny Unitas 15 49 53 37 26 
Carl Yastrzemski 33 4l 31 .26 
A. J. Foyt 57 = =.39 ~——-.30 
Brigitte Bardot 29 «21 
Elizabeth Taylor 37 
Sophia Loren 


*Data from Rumelhart and Greeno [1968]. 


Similarly, $f(a_{n-2})$ can be estimated as $f(a_{n-1}) \times (p_{a_{n-2}a_{n-1}}/p_{a_{n-1}a_{n-2}})$. And so 
on. Using the numbers f(a) calculated in this way, one can calculate the 
numbers $p_{ab}$ for all $(a, b) \ne (a_i, a_{i-1})$ and compare the observed $p_{ab}$ with 
the calculated $p_{ab}$, using a statistical test of goodness of fit. Unfortunately, 
estimating the values f(a) by this procedure can bias the results, for the 
estimates may be highly dependent on the particular order $a_1, a_2, \ldots, a_n$ 
chosen. More sophisticated statistical techniques for estimating the values 
f(a) are discussed in the literature. See, for example, Bradley and Terry 
[1952], Davidson [1970], Beaver and Gokhale [1975], or Beaver [1977]. 
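
The chain procedure just described can be sketched in a few lines. This is an illustration only: the function names are hypothetical, the prediction uses the strict utility form $p_{ab} = f(a)/(f(a) + f(b))$, which is consistent with the ratio identity above under forced choice, and the adjacent-pair proportions are transcribed from the first above-diagonal entry of each row of Table 6.7 in the order listed there.

```python
def chain_estimates(adjacent):
    """adjacent[i] is the observed p for the pair (a_i, a_{i+1}) in listed order.

    Returns scale values with the last one fixed at 1, working backward via
    f(a_i) = f(a_{i+1}) * p / (1 - p), where 1 - p is p for the reversed pair
    under forced choice.
    """
    f = [1.0]
    for p in reversed(adjacent):
        f.append(f[-1] * p / (1 - p))
    return f[::-1]

def predicted(f, i, j):
    """Strict utility prediction for the pair (a_i, a_j)."""
    return f[i] / (f[i] + f[j])

names = ["LJ", "HW", "CD", "JU", "CY", "AF", "BB", "ET", "SL"]
# Adjacent-pair entries of Table 6.7: LJ/HW, HW/CD, CD/JU, ..., ET/SL.
adjacent = [0.68, 0.59, 0.62, 0.75, 0.33, 0.57, 0.29, 0.37]

f = chain_estimates(adjacent)
for name, value in zip(names, f):
    print(f"f({name}) = {value:.3f}")

# Predicted proportion for a non-adjacent pair, e.g. LJ over CD.
print(f"predicted p(LJ, CD) = {predicted(f, 0, 2):.2f}")
```

By construction the adjacent pairs are reproduced exactly, so only the non-adjacent predictions carry information about goodness of fit; reordering `names` and `adjacent` shows how strongly the estimates depend on the chosen order.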

(a) Table 6.7 shows the results of an experiment due to Rumelhart 
and Greeno [1968] (Restle and Greeno [1970]). Subjects were asked to 
choose which of two individuals they would rather spend an hour with. 
The reported $p_{ab}$ is the proportion of subjects who expressed a preference 
for a over b. If the order of alternatives is chosen as the order in which 
they are listed in Table 6.7, and f(Sophia Loren) is set equal to 1, estimate 
the values f(a) using the procedure described above and check that these 
f(a) lead to the predicted numbers $p_{ab}$ shown in Table 6.8. Identify those 
entries of Table 6.8 that differ significantly from entries in Table 6.7. 

(b) Table 6.9 shows data collected by Estes (see Atkinson, Bower, 
and Crothers [1965, pp. 146-150]). The entry $p_{ab}$ is the proportion of 
subjects who expressed a preference for meeting and talking with a over 
meeting and talking with b.* Calculate several tables similar to Table 6.8 
by choosing several orders of points in the set A of alternatives, and then 
compare the results. 

(c) Table 6.10 shows data obtained by Davidson and Bradley [1969] 
in comparisons of the taste or flavor of different vanilla puddings. The 
entry $p_{ab}$ is the proportion of subjects who expressed a preference for the 
taste of a over the taste of b.† Beaver [1977] has estimated the values of the 
parameters f(a) using a weighted least-squares analysis. The values obtained 
are as follows: 

f(1) = 0.22, f(2) = 0.21, f(3) = 0.17, f(4) = 0.14, f(5) = 0.25. 

Compute the numbers $p_{ab}$ that correspond to these f(a), compare with the 
values in Table 6.10, and compare with the numbers obtained if the f(a) 
are obtained by a method such as that described above. 


*This is the data that was used to generate Table 6.3. 
†This is the data that was used to generate Table 6.4. 

Table 6.8. Predicted Proportion of Subjects Who Prefer Spending 
an Hour with a to Spending an Hour with b 

       LJ    HW    CD    JU    CY    AF    BB    ET    SL
LJ          .68   .75   .83   .94   .88   .91   .80   .71
HW                .59   .70   .88   .78   .83   .65   .53
CD                      .62   .8?   .71   .76   .57   .44
JU                            .75   .60   .67   .45   .32
CY                                  .33   .40   .21   .14
AF                                        .57   .35   .24
BB                                              .29   .19
ET                                                    .37
SL


10. Prove that if an imperfect forced choice pair comparison system 
satisfies the strict utility model, then it satisfies the product rule. 


11. (Luce and Suppes [1965]) A forced choice pair comparison system 
(A, p) satisfies the quadruple condition if for all a, b, c, d in A, 

$$p_{ab} \ge p_{cd} \;\Rightarrow\; p_{ac} \ge p_{bd}.$$

This condition was introduced by Davidson and Marschak [1959]. Show 
the following: 
(a) The quadruple condition follows from the strong utility model. 
(b) For imperfect forced choice pair comparison systems, the product 
rule implies the quadruple condition. 
(c) The quadruple condition implies SST. 
(Note: The converses of all these implications are false. Statistical tests of 
the quadruple condition have been developed by Falmagne [1976].) 


Table 6.9. Entry a, b Is the Proportion of Subjects Who Preferred Meeting and 
Talking with a to Meeting and Talking with b* 

                     Dwight       Winston     Dag           William
                     Eisenhower   Churchill   Hamerskjold   Faulkner
Dwight Eisenhower                 .57         .80           .82
Winston Churchill                             .76           .80
Dag Hamerskjold                                             .60
William Faulkner

*Data from Estes (Atkinson, Bower, and Crothers [1965, pp. 146-150], Restle 
and Greeno [1970, p. 241]). 

Table 6.10. Preferences for Vanilla Puddings* 
(Entry a, b is the proportion of subjects who preferred 
the taste of pudding a to pudding b.) 


1 2 3 4 5 
1 .20 64 -719 41 
2 .25 54 47 
3 36 -50 
4 -30 
5 


*Data from an experiment of Davidson and 
Bradley [1969]. 


12. (Luce and Suppes [1965]) Show that if an imperfect forced choice 
pair comparison system (A, p) satisfies the product rule, then for any set of 
$n \ge 3$ distinct elements $a_1, a_2, \ldots, a_n$ from A, 

$$p_{a_1a_2}\,p_{a_2a_3} \cdots p_{a_na_1} = p_{a_1a_n}\,p_{a_na_{n-1}} \cdots p_{a_2a_1}.$$


13. (Restle and Greeno [1970]) According to Restle [1961], one may 
think of a complex alternative as a set of objects or outcomes, and the 
choice between two alternatives as depending on the elements they do not 
have in common. Then, if A and B are distinct sets (of objects) and n(A) is 
the number of elements of A, the probability of choosing A over B is given 
by 

$$p_{AB} = \frac{n(A - B)}{n(A - B) + n(B - A)}. \tag{6.52}$$

(a) Show from the Restle model [Eq. (6.52)] that 

$$p_{AB} - p_{BA} = \frac{n(A) - n(B)}{n(A - B) + n(B - A)}.$$

(b) In the strict utility model, if n(A) is the measure of the strength 
of the response A, then 

$$p_{AB} = \frac{n(A)}{n(A) + n(B)}.$$

Show from this that 

$$p_{AB} - p_{BA} = \frac{n(A) - n(B)}{n(A) + n(B)}.$$

(c) Show from the Restle model that if A and B are two amounts of 
money, and A is a larger amount (that is, $A \supsetneq B$), then $p_{AB} = 1$. 

(d) Let S = {A, B, C}, n(A) = 5, n(B) = 4, n(C) = 3, $n(A \cap B) = 2$, 
$n(A \cap C) = 0$, and $n(B \cap C) = 1$. Show that if $p_{AB}$ is calculated from 
Eq. (6.52), (S, p) violates the product rule (and hence the strict utility 
model), but satisfies SST. 

(e) On the other hand, if n(A) = 10, n(B) = 9, n(C) = 5, $n(A \cap B) = 6$, 
$n(A \cap C) = 0$, and $n(B \cap C) = 3$, and $p_{AB}$ is calculated from Eq. 
(6.52), show that both the product rule and SST are violated by (S, p). 

(f) Show that if p is defined using Eq. (6.52) on all sets in a family 
S, then (S, p) satisfies the weak utility model (and hence WST). 

(g) Does (S, p) satisfy MST? 

(h) Show that if $n(A \cap B) = n(C \cap D)$ whenever $A \ne B$ and $C \ne D$, 
and if p is defined using Eq. (6.52), then (S, p) satisfies the strict utility 
model. 

14. Show that if a forced choice pair comparison system satisfies the 
strong utility model, and W is defined on A by Eq. (6.35), then 

$$aWb \;\Rightarrow\; (\forall c)(p_{ac} \ge p_{bc}).$$


15. Luce [1958, 1959] defines the trace of a forced choice pair comparison 
system (A, p) as the binary relation T on A defined by 

$$aTb \Leftrightarrow (\forall c)(p_{ac} \ge p_{bc}). \tag{6.53}$$

Show that (A, T) is reflexive and transitive and, moreover, 

$$aTb \;\&\; bTa \;\Rightarrow\; (\forall c)(p_{ac} = p_{bc}).$$


16. (Roberts [1971a]). Show that if a forced choice pair comparison 
system satisfies the strong utility model, then the trace of Eq. (6.53) is a 
weak order, and it is the same weak order as the relation W defined by Eq. 
(6.35). 


17. (Roberts [1971a]). Suppose (A, p) is a discriminated forced choice 
pair comparison system. Show that SST holds if and only if the trace T of 
Eq. (6.53) is a weak order. 


18. If (A, p) is the example in Exer. 5, show that 

$$\mathcal{G} = \{(A, R_\lambda): \lambda \in [\tfrac{1}{2}, 1)\}$$


is a family of semiorders, but not a homogeneous family. 


19. Show that Theorem 6.15 is false without the assumption that (A, p) 
is discriminated. 


20. (Roberts [1971a]). A. A. J. Marley (personal communication) has 
suggested several conditions on the family 

$$\mathcal{F} = \{(A, R_\lambda): \lambda \in [\tfrac{1}{2}, 1)\},$$

motivated by the semiorder axioms. Assume that (A, p) is a discriminated 
forced choice pair comparison system. $\mathcal{F}$ satisfies the First Semiorder 
Condition if, for all a, b, c, d in A and all $\gamma, \delta$ in $[\tfrac{1}{2}, 1)$, 

$$(aR_\gamma b \;\&\; cR_\delta d) \;\Rightarrow\; (aR_\gamma d \text{ or } cR_\delta b).$$

$\mathcal{F}$ satisfies the Second Semiorder Condition if, for all a, b, c, d in A and all 
$\gamma, \delta$ in $[\tfrac{1}{2}, 1)$, 

$$(aR_\gamma b \;\&\; bR_\delta c) \;\Rightarrow\; (aR_\gamma d \text{ or } dR_\delta c).$$

$\mathcal{F}$ satisfies the Weak Second Semiorder Condition if, for all a, b, c in A and 
all $\gamma, \delta$ in $[\tfrac{1}{2}, 1)$, 

$$(aR_\gamma b \;\&\; bR_\delta c) \;\Rightarrow\; (aR_\gamma c \;\&\; aR_\delta c).$$

Show the following: 

(a) The Weak Second Semiorder Condition follows from the Second 
Semiorder Condition, taking d = a and d = c and using irreflexivity of $R_\gamma$ 
and $R_\delta$. 

(b) The Second Semiorder Condition is equivalent to the statement 
that $\mathcal{F}$ is a homogeneous family of semiorders. 

(c) Using (b), the Second Semiorder Condition implies the First. 

(d) However, the First Semiorder Condition does not imply the 
Second. 

21. (Tversky and Russo [1969], Roberts [1971a]) We say that a discriminated 
forced choice pair comparison system (A, p) satisfies partial 
substitutability if 

$$(\forall a, b, c)[\,p_{ac} = p_{bc} \;\Rightarrow\; p_{ab} = \tfrac{1}{2}\,].$$

We say that (A, p) satisfies the strong version of SST if 

$$(\forall a, b, c)[\,p_{ab} \ge \tfrac{1}{2} \;\&\; p_{bc} \ge \tfrac{1}{2} \;\Rightarrow\; p_{ac} \ge \max\{p_{ab}, p_{bc}\}\,],$$

where strict inequality in both hypotheses implies strict inequality in the 
conclusion. 

(a) Show that for discriminated forced choice pair comparison sys- 
tems, the strong version of SST implies partial substitutability. 

(b) Show that for discriminated forced choice pair comparison sys- 
tems, SST does not imply partial substitutability. 

22. (Tversky and Russo [1969], Roberts [1971a]). The following conditions 
of probabilistic consistency on a discriminated forced choice pair 
comparison system are studied by Tversky and Russo [1969]: 

Substitutability: 

$$(\forall a, b, c)[\,p_{ab} \ge \tfrac{1}{2} \;\Leftrightarrow\; p_{ac} \ge p_{bc}\,].$$

(In particular, substitutability says that if $p_{ac} = p_{bc}$ for any c, then $p_{ab} = \tfrac{1}{2}$.) 

Independence: 

$$(\forall a, b, c, d)[\,p_{ac} \ge p_{bc} \;\Leftrightarrow\; p_{ad} \ge p_{bd}\,].$$

Strong Version of SST (see Exer. 21). 

Show the following: 
(a) For discriminated forced choice pair comparison systems, these 
three conditions are equivalent. 
(b) Each implies SST, but SST does not imply any of them. 

23. A forced choice pair comparison system (A, p) satisfies simple scalability 
if there is a real-valued function f on A and a function $\phi$: Re × Re → 
Re, with $\phi$ strictly increasing in the first argument and strictly decreasing 
in the second, so that 

$$p_{ab} = \phi[f(a), f(b)].$$

That is, the choice probabilities are a function of scale values f(a) and 
f(b). Show that the Fechnerian utility model implies simple scalability. 
(Tversky [1972] proved that if A is finite, then (A, p) satisfies simple 
scalability if and only if it satisfies the independence condition of Exer. 22. 
More recent results along these lines appear in Smith [1976].) 


24. Exercises 12 of Section 2.6 and 27 of Section 5.4 discuss an experi- 
ment aimed at rating several stereo speakers. The top four speakers from 
the (first) ranking in Table 2.8b were used in a pair comparison experiment 
modified to allow ties, and with judgments gathered from each of four 
experts or assessors. The results are shown in Table 6.11a. Entry A means 
model A was preferred, B means model B was preferred, and = means 
indifference. Speakers were ranked on the basis of the number of prefer- 
ences for them. Table 6.11b shows the resulting ranking. 

(a) Note that one of the assessors was nontransitive in his prefer- 
ences. 

(b) If each assessor’s preferences are transitive, and define a strict 
weak order, a well-known group decisionmaking procedure is the following. 
Let $f_i(x)$ be the number of elements x is strictly preferred to (ranked 
above) by the ith assessor. Rank x over y if and only if 

$$\sum_i f_i(x) > \sum_i f_i(y).$$

In case of equality, declare a tie. The number $B(x) = \sum_i f_i(x)$ is called the 
Borda count of x, after Jean-Charles de Borda, an eighteenth-century 
soldier and sailor, and one of the discoverers of the so-called voter’s 
paradox.* A problem with the Borda count is that it is possible to have 
$B(x) > B(y)$ while $p_{xy} < p_{yx}$, where $p_{xy}$ = the number of assessors ranking 
x over y divided by the total number of assessors. Give an example to 
illustrate this. (For further reference on group decisionmaking, see Luce 
and Raiffa [1957], Sen [1970], Fishburn [1972], or Arrow [1951].) 


*The paradox says that even if each assessor’s preferences define a strict weak order, there 
may be no strict weak order so that x is ranked above y in the order if and only if a majority 
of assessors rank x over y. (See Riker and Ordeshook [1973] for a recent discussion.) 

Table 6.11. Pair Comparison of Stereo Speakers* 

(a) 
                                               Assessors
Test No.   Model A        Model B          1    2    3    4
   1       Omal           Goodmans         A    A    A    B
   2       Omal           Marsden-Hall     A    A    =    A
   3       Goodmans       Marsden-Hall     A    A    A    A
   4       Goodmans       Quasar           B    B    B    A
   5       Omal           Quasar           A    B    A    A
   6       Marsden-Hall   Quasar           B    B    =    B

(b) Rank Order of Models on Basis of Number of Preferences 

Model               Number of Preferences
1. Omal                      9
2. Quasar                    7
3. Goodmans                  6
4. Marsden-Hall              0

*From Hi-Fi & Record News, July 1975, p. 107. 

(c) Restricted to A = {Goodmans, Omal, Quasar}, all the assessors 
are transitive. Show that on the set A, if p,, is defined as in (b), then the 
weak utility model is satisfied, SST is satisfied, but the strong utility model 
fails. Show that the number of preferences for a speaker defines a function 
that satisfies the weak utility model for (A, p). 

(d) Note that the ranking of speakers obtained in Table 6.11b is 
quite different from that obtained in Table 2.8b. This raises questions 
about the procedures used to obtain either of these tables. One possible 
approach to the discrepancy is to eliminate the judgments of the nontransi- 
tive assessor. Investigate how this changes the ranking of speakers ob- 
tained under the method of Section 2.6 and under the Borda count. 

25. (a) Show that if A is finite, $(A, R_\lambda)$ is a semiorder, and $I_\lambda$ is defined 
on A by Eq. (6.51), then $(A, I_\lambda)$ is an indifference graph (Exer. 23, Section 
6.1). 

(b) Suppose A is a set of sounds and $q_{ab}$ represents the frequency of 
times that a and b are judged equally loud. Suppose $J_\lambda$ is defined as 
follows: 

$$aJ_\lambda b \Leftrightarrow q_{ab} \ge \lambda. \tag{6.54}$$

(c) Suppose (A, p) is a forced choice pair comparison system, 

$$aR_\lambda b \Leftrightarrow p_{ab} > \lambda,$$

and $I_\lambda$ is defined on A from $R_\lambda$ by Eq. (6.51). Suppose 

$$q_{ab} = 1 - [p_{ab} + p_{ba}],$$

and J, is defined by Eq. (6.54). Is it possible for (A, J,) to be an 
indifference graph while (A, J,) is not? 

(d) Table 6.12 shows data $q_{ab}$ obtained from an experiment performed 
by Rothkopf [1957]. Find $(A, J_\lambda)$ as defined by Eq. (6.54) for 

(i) $\lambda = .1$ 
(ii) $\lambda = .2$ 
(iii) $\lambda = .3$. 

(e) For each value of $\lambda$ in part (d), determine if $(A, J_\lambda)$ is an 
indifference graph. 


Table 6.12. Judgments of Equal Loudness* 
(Entry a, b is the proportion of subjects who judged sound a and sound b 
to be equally loud.) 


*Data from Rothkopf [1957]. 


CHAPTER 7 


Decisionmaking under Risk or 
Uncertainty 


7.1 The Expected Utility Rule and the Expected Utility 
Hypothesis 


In this chapter, we consider for the first time a situation of decisionmak- 
ing under uncertainty. We allow for the possibility that one of a set of 
uncertain events may occur, each with a certain probability, and each with 
a known consequence. We face such decisionmaking problems often in our 
lives. For example, when we consider whether or not to buy insurance 
before we are 30 years old, we consider the possible but uncertain event of 
death. A doctor often faces a choice among alternative treatments, with 
different uncertain side effects. A government must spend money on one 
of several technologies designed to solve a problem, each of which has only 
a certain probability of providing the desired results. (For example: What 
design should we use for a rapid transit system? Should we invest large 
amounts of money on breeder reactor research, rather than on solar 
power? And so on.) 

To motivate our discussion of such decisionmaking problems, let us 
consider a simple gambling situation.* Suppose you are attending a busi- 
ness meeting and have a choice between parking your car at a meter or 
putting it in a parking lot. The meter can take coins for up to 2 hours—the 
maximum would require $1. The meter is monitored almost constantly, 
and the fine for overtime parking is $25. The lot would be $5, a flat fee. 
You think that the chance the meeting will be over within 2 hours is 80%. 
Should you put the car in the lot? The decision you face can be 


*This example is based on Lighthill [1978] and Welford [1970]. 
Table 7.1.

Event                               Probability   Payoff or Consequence to You

Act I: Put Car on Meter
Meeting over within 2 hours             .8        -$1
Meeting lasts more than 2 hours         .2        -($1 + $25) = -$26

Act II: Put Car in Lot
Meeting over within 2 hours             .8        -$5
Meeting lasts more than 2 hours         .2        -$5


summarized in Table 7.1, which gives the amount of money you have lost 
under each of the two possible actions, depending on the outside event of 
how long the meeting lasts. One way to make the decision is to compute 
the expected value of the payoff under each possible action, and choose 
the action with the larger expected value. In this case, the expected value 
of act I is 


.8(-$1) + .2(-$26) = -$6,
and the expected value of act II is
.8(-$5) + .2(-$5) = -$5.


Since —$5 is larger than —$6, you would choose the second act, and put 
your car in the lot. 

Let us change the numbers in this illustration. Suppose the lot would 
cost you $7 instead of $5. Then putting the car in the lot would no longer 
have the highest expected value—its expected value is now —$7. However, 
you might still be tempted to put the car in the lot, since by putting the car 
at a meter you risk having a large loss. This suggests, as we have pointed 
out in Sections 4.3.4 and 5.4.2, that the “value” of a sum of money is not 
necessarily proportional to its dollar amount. Rather, we should assess 
your utility of n dollars and compute for each act your expected utility
(expected value of your utility function) given that the act is chosen. Here, 
we see that act I has as its expected utility 


.8u(-$1) + .2u(-$26),


while act II has as its expected utility u(—$7), where u is utility. If, for 
example, u(-$1) = -1 and u(-$7) = -10 and u(-$26) = -100, then
act I has expected utility -20.8 and act II has expected utility -10. If we
were to choose on the basis of expected utility, we would still choose act II. 
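
The arithmetic of the parking example can be checked with a short sketch. The helper name `expected_value` and the data layout (lists of probability and payoff pairs) are illustrative assumptions, not notation from the text; the utilities are the illustrative values given above.

```python
from fractions import Fraction

def expected_value(lottery, u=lambda x: x):
    """Sum of p * u(c) over (probability, consequence) pairs."""
    return sum(p * u(c) for p, c in lottery)

# Acts from Table 7.1; payoffs in dollars, negative numbers are losses.
meter = [(Fraction(8, 10), -1), (Fraction(2, 10), -26)]
lot_5 = [(Fraction(1), -5)]   # lot costs $5
lot_7 = [(Fraction(1), -7)]   # lot costs $7

# Illustrative utilities from the text: u(-$1), u(-$7), u(-$26).
utility = {-1: -1, -7: -10, -26: -100}

ev_meter = expected_value(meter)               # expected dollar payoff of act I
ev_lot_5 = expected_value(lot_5)               # expected dollar payoff of act II
eu_meter = expected_value(meter, utility.get)  # expected utility of act I
eu_lot_7 = expected_value(lot_7, utility.get)  # expected utility of act II at $7

print(float(ev_meter), float(ev_lot_5))   # -6.0 -5.0
print(float(eu_meter), float(eu_lot_7))   # -20.8 -10.0
```

With the $7 lot, expected dollar value favors the meter (-$6 against -$7), while expected utility under these illustrative utilities still favors the lot.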
Speaking very generally, suppose we think of an act or choice or gamble
as leading to the occurrence of one of several events, \(A_1, A_2, \ldots, A_n\).
Suppose that the events are mutually exclusive and exhaustive, that is, that


\[ p(A_i \cap A_j) = 0 \quad \text{if } i \neq j \]


and

\[ \sum_i p(A_i) = 1, \]


where p(B) is the probability that event B will occur. Associated with each
event \(A_i\) is a reward or consequence \(c_i\), not necessarily money, to which
you associate some value or utility \(u(c_i)\). The reward \(c_i\) could be a kewpie
doll, a pass to gamble again, and so on. The expected utility of the act or
choice is given by


\[ E = \sum_i p(A_i)\,u(c_i). \tag{7.1} \]


We anticipate choosing an act with the largest possible expected utility. 

The notion of expected utility of an act or choice applies to a very broad 
range of situations*. Let us apply it to a hypothetical medical decisionmak- 
ing problem. A physician is considering two possible treatments (acts or 
choices), x and y. If treatment x is used, two possible events can occur. 
They are: 


\(A_1\): the treatment works,
\(A_2\): the treatment doesn't work.


Associated with these events are the consequences:


\(c_1\): the patient is completely cured and achieves good health,
\(c_2\): the patient dies.


Let us suppose that past records indicate that treatment x works one-third 
of the time. Thus, we can assign probabilities 


\[ p(A_1) = \tfrac{1}{3}, \qquad p(A_2) = \tfrac{2}{3}. \]


*This set-up still does not account for all the possible uncertainties in decisionmaking. The
probabilities \(p(A_i)\) may not be known. There may be several possible consequences \(c_i\)
associated with an event \(A_i\), and perhaps covered by some (unknown?) probability
distribution. And so on. The set-up also does not take into account changing information, and the
impact of this new information on decisions.
Suppose that a patient assesses his utility of death as 0 and his utility of a
healthy life as 100. (We shall discuss the utility of a life below.) Then

\[ u(c_1) = 100, \qquad u(c_2) = 0, \]

and the expected utility of treatment x for this patient is

\[ E(x) = p(A_1)u(c_1) + p(A_2)u(c_2) = \tfrac{1}{3}(100) + \tfrac{2}{3}(0) = 33\tfrac{1}{3}. \]

Let us now compare treatment y. Here, there are three possible events:

\(A'_1\): the treatment works completely,
\(A'_2\): the treatment doesn't work,
\(A'_3\): the treatment works partially.

Associated with these events are the consequences:

\(c'_1\): the patient is completely cured and achieves good health,
\(c'_2\): the patient dies,
\(c'_3\): the patient becomes a cripple for life.


According to past records, the various events following treatment y have 
the following probabilities: 


\[ p(A'_1) = .0001, \qquad p(A'_2) = .4999, \qquad p(A'_3) = .5000. \]


Suppose the patient’s utilities are as follows: 


\[ u(c'_1) = 100, \qquad u(c'_2) = 0, \qquad u(c'_3) = -200. \]


(This patient would rather die than be a cripple for life. Note that
\(u(c'_1) = u(c_1)\) and \(u(c'_2) = u(c_2)\), for consistency.) The expected utility of
treatment y for this patient is

\[ E(y) = p(A'_1)u(c'_1) + p(A'_2)u(c'_2) + p(A'_3)u(c'_3) = (.0001)(100) + (.4999)(0) + (.5000)(-200) = -99.99. \]


It is not unreasonable to choose between the two treatments on the basis 
of the expected utilities. That is, one chooses treatment x over treatment y 
if E(x) > E(y), and treatment y over treatment x if E(y) > E(x). In our 
hypothetical example, one chooses treatment x. This is the case even 
though treatment x leads to death with a higher probability. The reason for 
this result is that the patient would rather die than be a cripple for life. For 
a patient with a different utility function, the choice might be different.* 
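
The comparison of the two treatments, and its dependence on the patient's utilities, can be sketched as follows. The function name `expected_utility` is an illustrative choice, and the second patient, who assigns utility 80 to the partial cure, is purely hypothetical and does not appear in the text.

```python
from fractions import Fraction

def expected_utility(probabilities, utilities):
    """Equation (7.1): sum of p(A_i) * u(c_i); probabilities must sum to 1."""
    assert sum(probabilities) == 1
    return sum(p * u for p, u in zip(probabilities, utilities))

# Treatment x: works with probability 1/3, fails with probability 2/3.
p_x = [Fraction(1, 3), Fraction(2, 3)]
# Treatment y: full cure, death, partial cure.
p_y = [Fraction(1, 10000), Fraction(4999, 10000), Fraction(5000, 10000)]

# Patient of the text: u(cure) = 100, u(death) = 0, u(partial cure) = -200.
E_x = expected_utility(p_x, [100, 0])
E_y = expected_utility(p_y, [100, 0, -200])
choice = "x" if E_x > E_y else "y"

# Hypothetical second patient (an assumption): partial cure has utility 80.
E_y_other = expected_utility(p_y, [100, 0, 80])
choice_other = "x" if E_x > E_y_other else "y"

print(float(E_x), float(E_y), choice)      # 33.33... -99.99 x
print(float(E_y_other), choice_other)      # 40.01 y
```

The probabilities are the same for both patients; only the utility of the third consequence changes, and that alone reverses the choice.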

The use of expected utilities to make choices can be viewed as a 
prescriptive or normative rule. We shall call it the Expected Utility Rule 
(EU Rule). This rule goes back to Daniel Bernoulli [1738]. It can be
formalized as follows: If x and y are gambles, then


\[ x \text{ is preferred to } y \iff \sum_i p(A_i)u(c_i) > \sum_i p(A'_i)u(c'_i), \tag{7.2} \]


where the first sum is over events and consequences of gamble x and the 
second is over events and consequences of gamble y. The use of expected 
utilities to make choices can also be viewed as a descriptive rule. Then it is 
asserted that individuals make choices as if they were maximizing expected 
utility. It is not necessarily claimed that the calculation of expected utility 
is conscious. Rather, given preferences among acts or choices under risk or
uncertainty, it is claimed that we can account for the preferences by 
finding utilities and probabilities so that (7.2) holds. 

The assertion that individuals choose as if they maximize expected 
utility is known as the Expected Utility Hypothesis (EU Hypothesis). If the 
probabilities are based on subjective assessments, then the hypothesis is
sometimes called the Subjective Expected Utility Hypothesis (SEU Hypothe- 
sis). We shall have more to say about subjective assessments of probability 
in Chapter 8. Both the EU and SEU Hypotheses will be regarded, as have 
other descriptive measurement models, as hypotheses that are subject to 
test. In any case, the relation between the EU Hypothesis and the EU Rule 


*Whether or not the choice of treatment should depend on the patient’s utilities is a moral 
issue for medical decisionmakers. (Slack [1972] argues that, since it is the patient who has the 
major stake in the outcome of the decision, it is the patient whose utilities should be used. For 
a discussion of the use of decision theory in medical decisionmaking, see Aitchison [1970], 
Forst [1972], Ginsberg [1971], Ledley and Lusted [1959], Lusted [1968] and Schwartz et al. 
[1973]. A survey of the literature is contained in Fryback [1974].)
is the following: The EU Hypothesis holds if we make choices as if we are
following the EU Rule. 

In this chapter, we shall take both a prescriptive and a descriptive 
approach. We shall describe in Section 7.2 various ways to make use of the 
EU Rule or Hypothesis in making decisions. In Section 7.3, we shall 
discuss how the EU Rule or Hypothesis can be used to calculate utility 
functions over multidimensional sets of consequences. Then, in Section 7.4, 
we shall discuss a representation theorem which starts with preference 
among acts and gives conditions sufficient for these preferences to arise 
from comparisons of expected utilities. This theorem gives conditions on 
preferences among acts for there to be utility (and probability) functions 
satisfying Eq. (7.2). 

Before closing this section, let us make one important comment. 
Suppose probabilities p(A;) are fixed. If utilities of consequences are just 
measured on an ordinal scale, then decisions using the EU Rule are not 
meaningful (and should not be made). For example, suppose choice x has
outcomes a or b, each with probability 1/2, while choice y has outcomes c or
d, each with probability 1/2. Suppose u(a) = 100, u(b) = 200, u(c) = 50, and
u(d) = 300. Then E(x) = 150 and E(y) = 175, so y is chosen over x.
However, if u'(a) = 200, u'(b) = 400, u'(c) = 10, and u'(d) = 500, then
using u' rather than u, E'(x) = 300 and E'(y) = 255, so x is chosen over y.
If u is just an ordinal utility function, then u' is obtained from u by an
admissible transformation. Comparisons of expected utilities using utility
functions on an interval scale are meaningful. For


\[ \sum_i p(A_i)[\alpha u(c_i) + \beta] > \sum_i p(A'_i)[\alpha u(c'_i) + \beta] \]
\[ \iff \alpha \sum_i p(A_i)u(c_i) + \beta \sum_i p(A_i) > \alpha \sum_i p(A'_i)u(c'_i) + \beta \sum_i p(A'_i) \]
\[ \iff \sum_i p(A_i)u(c_i) > \sum_i p(A'_i)u(c'_i), \]

using \(\alpha > 0\) and \(\sum_i p(A_i) = \sum_i p(A'_i) = 1\).
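
The contrast between ordinal and interval scales can be checked numerically. The sketch below uses the utilities u and u' of the example; the particular affine transformation (multiply by 3, subtract 7) is an arbitrary assumption chosen only to illustrate a positive linear change of scale.

```python
from fractions import Fraction

half = Fraction(1, 2)
x = [(half, "a"), (half, "b")]   # choice x: outcomes a or b
y = [(half, "c"), (half, "d")]   # choice y: outcomes c or d

def expected_utility(lottery, u):
    """Sum of p * u(outcome) over (probability, outcome) pairs."""
    return sum(p * u[outcome] for p, outcome in lottery)

u = {"a": 100, "b": 200, "c": 50, "d": 300}
u_prime = {"a": 200, "b": 400, "c": 10, "d": 500}   # same ordering as u

# u' is an admissible transformation of u on an ordinal scale:
# it ranks the four outcomes in the same order.
same_order = sorted(u, key=u.get) == sorted(u_prime, key=u_prime.get)

prefers_y_under_u = expected_utility(y, u) > expected_utility(x, u)
prefers_y_under_u_prime = expected_utility(y, u_prime) > expected_utility(x, u_prime)

# A positive affine transformation (alpha = 3, beta = -7 are arbitrary).
v = {k: 3 * val - 7 for k, val in u.items()}
prefers_y_under_v = expected_utility(y, v) > expected_utility(x, v)

print(same_order, prefers_y_under_u, prefers_y_under_u_prime, prefers_y_under_v)
# True True False True
```

The order-preserving change from u to u' reverses the decision, while the affine change from u to v leaves it intact, as the derivation above predicts.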


Exercises 


1. In comparing two television sets of equal price, we consider three 
possible events: The set will last a long time with satisfactory performance, 
the set will last a moderate amount of time with satisfactory performance, 
or the set will never work satisfactorily. For Brand x, we assign these 
events the probabilities 5, i, and z respectively. For Brand y, we assign 
them probabilities 3, i and i, respectively. If the consequences corre- 
sponding to these events have utilities 100, 10, and —200, respectively, 
then according to the EU Rule, which brand should we buy? 
2. As we have observed, most economists believe that the utility of
money is not a linear function of the dollar amount. Adding the same 
amount of money becomes less and less important as you have more and 
more, it is frequently argued. As we pointed out in Chapters 4 and 5, 
Bernoulli [1738] hypothesized that the utility of money is a logarithmic 
function of the dollar amount, and Gabriel Cramer in 1728 argued that the 
utility of money is a power function of the dollar amount. Let us consider 
as an example to compare these possibilities the question of whether to buy 
insurance. You have a $100,000 home. Fire insurance would cost $200. 
Suppose you think the probability of your house’s burning is .001. Thus, 
with no insurance, you lose $100,000 with probability .001, and otherwise 
you break even. With insurance, you lose $200 with certainty. Under the 
EU Rule, which of the following utility functions implies purchase of the 
insurance? 


(a) u(n) =n — 1, 
(b) u(n) = — log, in —1| if nO, 
(c) u(n) = — 2-41 if nso. 


(More on the utility of money in the next exercise and in Section 7.2.4.) 


3. Suppose you are indifferent between the following two gambles: 
Gamble 1. Play a game in which you are sure to win $15. 
Gamble 2. Play a game in which you have a 50% chance of 
winning $20 and a 50% chance of losing $12. 
Suppose u is a utility function over the set of consequences. If the EU 
Hypothesis holds with u, and u($1) ≠ 0, show that u($n) could not always
be nu($1). 


4. An individual faces a choice of careers. If he chooses career A, he will 
either make it big or not. He has a 50-50 chance, as he sees it, of making it 
big, in which case he will have steady earnings of $30,000 annually over his 
entire career. If he doesn’t make it big, he will have steady earnings of only 
$15,000 annually over his entire career. If the individual chooses career B, 
he will either make it big early and fade later, or struggle early and make it 
big later. He thinks there is a 50-50 chance of either alternative. In the 
former case, he will earn about $30,000 annually over the first half of his 
career, but only $15,000 for the second half. In the latter case, the situation 
is reversed. Thus, all other things being equal, the individual faces in each 
career a 50-50 chance between two income patterns. In career A, the 
choice is between ($30,000, $30,000) and ($15,000, $15,000), and in career 
B between ($30,000, $15,000) and ($15,000, $30,000). If the decision is 
purely on the basis of income pattern, Fishburn [1970] suggests many 
individuals would choose career B. 

(a) Show that if the EU Hypothesis holds with the utility function u, 
then u for such individuals would satisfy 


u($30,000, $15,000) + u($15,000, $30,000) >
        u($30,000, $30,000) + u($15,000, $15,000).        (7.3)

(b) Show that for such individuals, there can be no function f so that 


u(a, b) = f(a) + f(b). 
(c) Indeed, show that there are not even functions \(f_1\) and \(f_2\) so that
\(u(a, b) = f_1(a) + f_2(b)\).

(For further discussion of multiperiod income streams, see, for example, 
Bell [1974], Fishburn [1973], Keeney and Raiffa [1976], and McGuire and 
Radner [1972]. For further discussion of additivity of utility functions, see 
Section 7.3.2.) 

5. (a) In Exer. 4, suppose \(A_1 = A_2 = \{\$15{,}000, \$30{,}000\}\), R is the binary
relation of preference on \(A_1 \times A_2\), and u is an ordinal utility function
for R satisfying (7.3). What possible R's satisfy these conditions?

(b) Which of the above R’s define an additive conjoint structure 
(Section 5.4)? 


6. (a) Suppose you have invested in a stock of XYZ Company, and the 
stock has gone down in price. You could sell it now at a loss of $225. Let 
us call this Strategy I. You also have two alternative investment strategies 
(involving different choice of lower and upper selling prices for the stock). 
Strategy II, according to your assessment, would lead to an ultimate loss of 
$425 with probability .65, a loss of $125 with probability .25, and an 
ultimate profit of $275 with probability .10. Strategy III would lead to an 
ultimate loss of $325 with probability .75, an ultimate loss of $125 with 
probability .2, and an ultimate gain of $275 with probability .05. All 
probabilities are your subjective assessments and may not be completely 
accurate. The dollar amounts are small enough so that it might be 
reasonable to assume that the utility of $n is nu($1), or n units. Discuss the
alternative strategies, both from your intuitive feeling and from the point 
of view of expected utility. Note that subjective probabilities are only 
estimates, and expected gains are close, and there is a measure of risk in 
both Strategies II and III. This might affect choice of strategy. 

(b) It is interesting to analyze these three strategies from your present 
position. With Strategy I, you break even for sure. With Strategy II, you 
lose $200 with probability .65, gain $125 with probability .25, and gain 
$500 with probability .1. With Strategy III, you lose $100 with probability 
.75, gain $125 with probability .2, and gain $500 with probability .05. If the
decision is considered from this new point of view, observe that one’s 
preferences may be different from those expressed about the previously 
formulated problem. What does that say about utility if the EU Hypothesis 
holds? 


7.2 Use of the EU Rule and Hypothesis in Decisionmaking* 
7.2.1 Lotteries 
In the previous section, we introduced the notion of expected utility and 


the Expected Utility Rule and Hypothesis. In this section, we present 
several tools or methods for help in making complex decisions, assuming 


*The discussion in Sections 7.2 and 7.3 is based heavily on Raiffa [1968, 1969]. 
the EU Rule or Hypothesis holds. In particular, we shall introduce the
notion of a lottery, we shall see how to use the EU Hypothesis and lotteries
to calculate a utility function, we shall introduce the notion of a basic
reference lottery ticket (brlt) and we shall apply lotteries and brlt's to the
calculation of the utility of money and to certain public health questions.

In Section 7.1, we associated with each act or choice a collection of 
possible events (mutually exclusive and exhaustive) and a consequence 
corresponding to each event. We spoke of the probability of an event and 
the utility of a consequence. We can suppress all mention of the events and
speak only of the consequences. Then we speak of the probability of a
consequence as well as its utility. It is convenient to think of an act or
choice as a “lottery” with consequences \(c_1, c_2, \ldots, c_n\), consequence \(c_i\)
attained with probability \(p_i\). We might represent such a lottery using a tree
diagram with the end points of the branches labeled with the consequences
and the branches themselves labeled with the probabilities. Such a diagram
is shown below:


[Tree diagram: branches labeled \(p_1, p_2, \ldots, p_n\) ending at consequences \(c_1, c_2, \ldots, c_n\).]


By our assumptions about the events associated with a given act or choice,
\(p_1 + p_2 + \cdots + p_n = 1\), an assumption we shall always make for our
lotteries.

Suppose K is a set of consequences and L is a collection of lotteries with 
consequences in K. We let R be a binary relation on L, the relation of 
(strict) preference. (We assume R is the preference relation of some 
decisionmaker or decisionmaking body.) We define weak preference S and 
indifference E as usual, namely, by 


\[ \ell S \ell' \iff \sim \ell' R \ell, \tag{7.4} \]
\[ \ell E \ell' \iff \sim \ell R \ell' \;\&\; \sim \ell' R \ell. \tag{7.5} \]

Any function \(u: K \to Re\) will be called a value function on K, and it will be
convenient to distinguish value functions on K from utility functions, 
which are value functions on K with certain special properties, for exam- 
ple, the property of preserving certain observed relations on K. 

We say that the triple (K, L, R) satisfies the Expected Value (EV) 
Hypothesis if there is a value function u on K with the following property: 
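Whenever \(\ell\) and \(\ell'\) are in L and \(\ell\) is the lottery


[Tree diagram of lottery \(\ell\): consequences \(c_1, c_2, \ldots, c_n\) attained with probabilities \(p_1, p_2, \ldots, p_n\).]


and \(\ell'\) is the lottery


[Tree diagram of lottery \(\ell'\): consequences \(d_1, d_2, \ldots, d_m\) attained with probabilities \(q_1, q_2, \ldots, q_m\).]


then


\[ \ell R \ell' \iff \sum_i p_i u(c_i) > \sum_j q_j u(d_j). \tag{7.6} \]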


We shall say u is a value function satisfying the Expected Value (EV) 
Hypothesis. The first sum in Eq. (7.6) is denoted \(E(\ell)\) and is called the
expected value of the lottery \(\ell\). If there is a utility function u on K satisfying
the EV Rule, we say that the Expected Utility (EU) Hypothesis holds, that
u satisfies the Expected Utility (EU) Rule, and that \(E(\ell)\) is the expected
utility of the lottery \(\ell\).

Recall that the statement \(E(\ell) > E(\ell')\) is meaningless if utility u is only
an ordinal scale. That does not mean that the EU Hypothesis is meaningless
or false if utility is only an ordinal scale. For the EU Hypothesis
simply says that individuals act by making such comparisons (for a
particular utility function); the EU Hypothesis is not concerned with the
meaningfulness of the comparisons, nor is it affected by the issue of
meaningfulness. In other words, even though the statement \(E(\ell) > E(\ell')\)
might be meaningless, it is still interesting to explore this statement and the
EU Hypothesis if u just defines an ordinal scale.

Sometimes it is convenient to speak of complex lotteries. For example,
suppose lottery \(\ell\) is as follows:


[Tree diagram of lottery \(\ell\): $50 with probability .5, -$50 with probability .5.]


Lottery \(\ell\) gives a 50% chance of winning $50 and a 50% chance of losing
$50. Consider next the lottery*


[Tree diagram of lottery \(\ell'\), one of whose consequences is -$60.]


Suppose you are offered a gamble in which, with probability .5, you win
$50, and with probability .5, you are entered in the lottery \(\ell'\). We can
represent this complex lottery as a lottery \(\ell''\) in two ways:


[Tree diagrams of the complex lottery \(\ell''\): $50 with probability .5, entry into \(\ell'\) with probability .5.]


By using standard properties of tree diagrams, which we shall at this point
assume hold for our lotteries, we may translate \(\ell''\) into a simple lottery
with the same expected utility.


*The reader should note that the \(\ell\) used in the text and the \(\ell\) used in the art are the same symbol,
even though they differ slightly in appearance.

[Tree diagram of the equivalent simple lottery, one of whose consequences is -$60.]
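
The reduction of a complex lottery to a simple one amounts to multiplying probabilities along each path of the tree and adding the results for equal consequences. The sketch below is only an illustration: the representation of a lottery as a list of (probability, outcome) pairs and the function names are assumptions, and since the probabilities of lottery ℓ' are not recoverable here, the version used (a 50-50 chance at $100 or -$60) is hypothetical.

```python
from collections import defaultdict
from fractions import Fraction

def reduce_lottery(lottery):
    """Flatten a possibly nested lottery into {consequence: probability}."""
    simple = defaultdict(Fraction)          # missing keys start at 0
    for p, outcome in lottery:
        if isinstance(outcome, list):       # the outcome is itself a lottery
            for c, q in reduce_lottery(outcome).items():
                simple[c] += p * q
        else:
            simple[outcome] += p
    return dict(simple)

def expected_utility(lottery, u):
    """Expected utility of a possibly nested lottery, computed recursively."""
    return sum(
        p * (expected_utility(o, u) if isinstance(o, list) else u(o))
        for p, o in lottery
    )

half = Fraction(1, 2)
l_prime = [(half, 100), (half, -60)]        # hypothetical version of l'
l_double_prime = [(half, 50), (half, l_prime)]

simple = reduce_lottery(l_double_prime)
print(simple)   # 50 with probability 1/2; 100 and -60 with probability 1/4 each

u = lambda dollars: dollars                 # any value function would do here
eu_complex = expected_utility(l_double_prime, u)
eu_simple = sum(p * u(c) for c, p in simple.items())
```

For any value function u the two expected utilities agree, which is the property of tree diagrams that the text assumes.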


Before closing this subsection, we observe that preference among 
lotteries may not be transitive. Consider the following lotteries: 


[Tree diagrams of lotteries \(\ell_1\), \(\ell_2\), and \(\ell_3\), described in the text that follows.]


In \(\ell_1\), there is a 50% chance of breaking even and a 50% chance of losing
$100. In \(\ell_2\), you lose $45 for certain. In \(\ell_3\), you lose $45 for certain and then
enter a lottery that gives you $45 with probability .5 and takes $50 with
probability .5. According to Raiffa [1968, p. 75], many subjects prefer \(\ell_1\) to
\(\ell_2\), even though, if utility satisfies u($n) = nu($1), \(E(\ell_1) < E(\ell_2)\). They
would rather gamble on breaking even than losing $45 for certain. This
violates the EU Rule. Many of these subjects also prefer \(\ell_2\) to \(\ell_3\). Since the
gamble for $45 versus -$50 is unfair, they would rather not take it. Thus,
for many subjects, \(\ell_1 R \ell_2\) and \(\ell_2 R \ell_3\). However, for many subjects, \(\ell_3 R \ell_1\). For
\(\ell_3\) gives a 50-50 chance at $0 or -$95, which is better than a 50-50 chance
for $0 or -$100. This is a violation of transitivity of preference, a
condition that follows from the EU Rule. When non-transitivity is pointed 
out to such subjects, some become uncomfortable with their choices, 
whereas others will not change their minds, saying that these original 
choices represent their true preferences, even though this seems illogical. 
We mention another example of violation of the EU Rule in Section 
7.2.4, and other tests of the EU Rule and Hypothesis in Sections 7.2.2 and 
7.3.2. For summaries of tests of the EU Rule and Hypothesis, see Becker
and McClintock [1967], Coombs, Dawes, and Tversky [1970, Chapter 5], 
Edwards [1954, 1961], and Luce and Suppes [1965]. 
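
The three lotteries of the intransitivity example can be examined with a short sketch. The names `l1`, `l2`, `l3` and the brute-force search over rankings are illustrative assumptions; the linear utility u($n) = n is the special case mentioned in the text.

```python
from fractions import Fraction
from itertools import permutations

half = Fraction(1, 2)
lotteries = {
    "l1": [(half, 0), (half, -100)],   # break even or lose $100
    "l2": [(Fraction(1), -45)],        # lose $45 for certain
    "l3": [(half, 0), (half, -95)],    # lose $45, then win $45 or lose $50
}

def expected_utility(lottery, u=lambda dollars: dollars):
    return sum(p * u(c) for p, c in lottery)

# Expected utilities when utility is linear in money.
linear_eu = {name: expected_utility(l) for name, l in lotteries.items()}

# Strict preferences reported by many subjects: (preferred, less preferred).
reported = [("l1", "l2"), ("l2", "l3"), ("l3", "l1")]

def consistent_rankings(names, preferences):
    """All strict rankings (best first) that agree with every stated preference."""
    found = []
    for order in permutations(names):
        rank = {name: i for i, name in enumerate(order)}
        if all(rank[a] < rank[b] for a, b in preferences):
            found.append(order)
    return found

rankings = consistent_rankings(list(lotteries), reported)
print({k: float(v) for k, v in linear_eu.items()})  # l1: -50.0, l2: -45.0, l3: -47.5
print(rankings)                                     # []
```

No ranking agrees with all three reported preferences, so no assignment of expected utilities, linear or otherwise, can satisfy (7.6) for these subjects.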


7.2.2 Utility from Expected Utility 


In Section 3.1, we studied ordinal utility functions, real-valued functions u 
on relational systems (A, R) so that for all a, b € A, 


\[ aRb \iff u(a) > u(b). \tag{7.7} \]


For this chapter, we shall call a function u satisfying (7.7) an order-preserv- 
ing utility function, so as not to confuse such functions with ordinal scales; 
the order-preserving utility functions we shall study might have properties 
in addition to satisfying (7.7), and so might define scale types stronger than 
ordinal. We shall also speak of full utility functions, utility functions 
satisfying all the properties we want to demand of them, and we note that 
order-preserving utility functions might not be full. In Section 3.1, we 
observed that an order-preserving utility function u could be calculated by 
using a pair comparison experiment and then setting 


u(x) = number of y so that x is preferred to y. 


If there are many alternatives, performing a pair comparison experiment 
becomes unworkable: With n alternatives, n(n - 1) ordered pairs must be
tested, or n(n - 1)/2 unordered pairs.* In this section, we shall see that if
the EU Hypothesis is true, then we can use it to find an order-preserving 
utility function among consequences more rapidly than by performing a 
pair comparison experiment. 
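
The counting construction and its cost can be illustrated as follows. The four alternatives and the simulated respondent with a hidden ranking are hypothetical; the function `counting_utility` simply implements u(x) = number of y so that x is preferred to y.

```python
def counting_utility(alternatives, prefers):
    """u(x) = number of alternatives y such that x is preferred to y."""
    return {
        x: sum(1 for y in alternatives if y != x and prefers(x, y))
        for x in alternatives
    }

# Hypothetical respondent: a hidden strict ranking, 0 = most preferred.
hidden_rank = {"tea": 0, "coffee": 1, "juice": 2, "water": 3}
queries = []   # every ordered pair presented to the respondent

def prefers(x, y):
    queries.append((x, y))
    return hidden_rank[x] < hidden_rank[y]

u = counting_utility(list(hidden_rank), prefers)
n = len(hidden_rank)

print(u)                           # {'tea': 3, 'coffee': 2, 'juice': 1, 'water': 0}
print(len(queries), n * (n - 1))   # 12 12
```

All n(n - 1) ordered pairs are presented, which is the quadratic cost that the lottery-based procedure of this section avoids.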

Suppose \(\succ\) is a strict preference relation on the set K. We would like to
find an order-preserving utility function on \((K, \succ)\), that is, a function
\(u: K \to Re\) satisfying


\[ c \succ d \iff u(c) > u(d). \tag{7.8} \]


*Because of this, experimenters take various shortcuts in performing a pair comparison 
experiment. For instance, they assume such properties of preference as transitivity, and use 
transitivity to extrapolate some preferences from those previously expressed. See Warfield 
[1973, 1974a,b, 1976] and Baldwin [1975] for examples of this approach. There has been a fair 
amount of theoretical work devoted to this question of how to design a pair comparison 
experiment so as to minimize the total number of questions which must be asked if, for 
example, it is assumed that preferences form a strict simple order. For some discussion of this 
problem, see for example Knuth [1973], Wells [1971, pp. 206ff.], Busacker and Saaty [1965, pp.
228ff.], Ford and Johnson [1959], and Steinhaus [1950, pp. 37-40].
We shall define weak preference \(\succsim\) and indifference \(\sim\) on K by

\[ c \succsim d \iff \sim[d \succ c] \]

and

\[ c \sim d \iff \sim[c \succ d] \;\&\; \sim[d \succ c]. \]


In a sense, each element in the set K is a lottery, for we can identify a
consequence c in K with the lottery that gives c with probability 1. This
lottery will be denoted \(\ell(c)\). Throughout this chapter, we shall assume that
each such lottery is in L. If R is a strict preference relation on L, we shall
also assume that for all \(c, d \in K\),


\[ c \succ d \iff \ell(c) R \ell(d). \tag{7.9} \]


If these two assumptions hold, we say (L, R) extends \((K, \succ)\). If (L, R)
extends \((K, \succ)\), then any value function u on K satisfying the EV Rule is
an order-preserving utility function for \((K, \succ)\). For


\[ c \succ d \iff \ell(c) R \ell(d) \iff E[\ell(c)] > E[\ell(d)] \iff u(c) > u(d). \]


We shall assume the EV Hypothesis, that is, that there is a value function u 
on K satisfying the EV Rule. We shall not assume that u is known, but 
only that it exists. 

Assume that there are two elements of K, \(c_*\) and \(c^*\), so that


\[ c^* \succ c_* \tag{7.10} \]

and so that for all c in K,

\[ \sim[c \succ c^*] \quad \text{and} \quad \sim[c_* \succ c]. \tag{7.11} \]


That is, \(c^*\) is strictly preferred to \(c_*\), nothing is strictly preferred to \(c^*\), and
\(c_*\) is not strictly preferred to anything. The consequences \(c^*\) and \(c_*\) can be
thought of as the “best” and “worst” consequences in K. Since \(c^* \succ c_*\) and
u is an order-preserving utility function, we have \(u(c^*) > u(c_*)\), and hence
\(u(c^*) - u(c_*) > 0\).

Given c in K, let \(\ell(c)\) be the lottery with only one consequence, c. We
are assuming that \(\ell(c)\) always belongs to L. Consider the lottery

[Tree diagram of lottery \(\ell_\pi\): \(c^*\) with probability \(\pi\), \(c_*\) with probability \(1 - \pi\).]        (7.12)




Since (L, R) extends \((K, \succ)\), if \(\pi\) is 1, you either prefer \(\ell_\pi\) to \(\ell(c)\) or are
indifferent between these lotteries. If \(\pi\) is 0, you either prefer \(\ell(c)\) to \(\ell_\pi\) or
are indifferent between these lotteries. As we let \(\pi\) gradually increase from
0 to 1, it is reasonable to assume that we can find some number \(\pi = \pi(c)\)
so that you are indifferent between \(\ell_{\pi(c)}\) and \(\ell(c)\).* Then \(\pi(c)\) defines an
order-preserving utility function over K. For


\[ E(\ell_{\pi(c)}) = \pi(c)u(c^*) + [1 - \pi(c)]u(c_*) = \pi(c)[u(c^*) - u(c_*)] + u(c_*). \]


Since you are indifferent between \(\ell_{\pi(c)}\) and \(\ell(c)\), the EV Hypothesis implies
that


\[ E[\ell(c)] = E[\ell_{\pi(c)}], \]

or

\[ u(c) = \pi(c)[u(c^*) - u(c_*)] + u(c_*). \]


Since \(u(c^*) - u(c_*) \neq 0\), we may divide by \(u(c^*) - u(c_*)\), and we obtain


\[ \pi(c) = \frac{u(c) - u(c_*)}{u(c^*) - u(c_*)} = \alpha u(c) + \beta, \]

where

\[ \alpha = \frac{1}{u(c^*) - u(c_*)} \]

and

\[ \beta = \frac{-u(c_*)}{u(c^*) - u(c_*)}. \]

Since \(\alpha > 0\), it follows that for all c and d in K,

\[ u(c) > u(d) \iff \pi(c) > \pi(d). \]


Since u is an order-preserving utility function over K, we have


\[ c \succ d \iff \pi(c) > \pi(d). \]


*The existence of such a number \(\pi\) requires an assumption about preferences among
lotteries, which we shall formalize below.
Thus, \(\pi\) is an order-preserving utility function over K. Notice that
computation of the utility function \(\pi\) only assumes that u exists, and does not
require knowledge of u. Also, there is no need to assume that the individual
whose utility function is being calculated actually computes expected
values or utilities to make decisions, but only to assume that he acts as if
he makes decisions on the basis of expected utilities. Finally, calculation of
the number \(\pi(c)\) is required only n times if there are n consequences being
compared.
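
The elicitation procedure can be simulated. In the sketch below, the respondent is a hypothetical individual who acts as if maximizing the expected value of a hidden function u (taken to be the square root of the dollar amount, purely as an assumption); the experimenter never reads u and only asks whether the gamble is preferred to the sure consequence. The bisection search and the names `elicit_pi` and `make_respondent` are illustrative choices, not part of Raiffa's procedure as described in the text.

```python
import math

def elicit_pi(prefers_gamble, tol=1e-12):
    """Bisect on pi until the respondent is (nearly) indifferent.

    prefers_gamble(pi) is True when the lottery giving the best consequence
    with probability pi and the worst with probability 1 - pi is strictly
    preferred to the sure consequence c.
    """
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if prefers_gamble(mid):
            hi = mid       # gamble too attractive: indifference lies lower
        else:
            lo = mid       # sure thing at least as good: indifference lies higher
    return (lo + hi) / 2

def make_respondent(u, best, worst, c):
    """Hypothetical respondent who acts as if maximizing expected u."""
    def prefers_gamble(pi):
        return pi * u(best) + (1 - pi) * u(worst) > u(c)
    return prefers_gamble

hidden_u = math.sqrt          # assumption: unknown to the experimenter
best, worst = 1000, 0         # c^* and c_* in dollars
amounts = [0, 400, 500, 1000]

pi = {c: elicit_pi(make_respondent(hidden_u, best, worst, c)) for c in amounts}

# Value predicted by the derivation above for c = $400.
predicted_400 = (hidden_u(400) - hidden_u(worst)) / (hidden_u(best) - hidden_u(worst))
print(round(pi[400], 6), round(predicted_400, 6))
```

One elicitation is run per consequence, and the resulting numbers are ordered in the same way as the hidden u, although u itself is never observed.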

In principle, the procedure is simple. To illustrate it, suppose the “best”
and “worst” consequences of a given surgical procedure are “cure” (\(c^*\))
and “death” (\(c_*\)). To calibrate an individual’s utility function for various
complications (for example, loss of a leg) we would consider gambles
with a probability \(\pi\) of cure and \(1 - \pi\) of death. We would gradually
increase \(\pi\) until the individual told us he was indifferent between the
gamble on the one hand and the complication (loss of a leg, for example)
with certainty on the other. (This involves some hard choices!) The value of \(\pi\)
for which he is indifferent can be used as his utility for the complication,
and used in later decisionmaking.

Perhaps a more down-to-earth example is the following: Suppose an
individual is considering how much money to invest in a risky venture if
his payoff is $1000 if things go well and $0 otherwise. He can calculate his
utility of, say, $400 by finding that value of \(\pi = \pi(\$400)\) for which he is
indifferent between having $400 for certain and having a \(\pi\)-probability of
obtaining $1000, a \((1 - \pi)\)-probability of $0. Note that \(\pi(\$n)\) is almost
certainly not \(n\pi(\$1)\). For example, \(\pi(\$500)\) is almost certainly greater than
1/2, so \(\pi(\$1000) < 2\pi(\$500)\). We shall ask the reader to think about the
meaningfulness of this conclusion. See Section 7.2.4 for more on the utility
of money.

To the best of the author’s knowledge, the procedure we have described 
is due to Raiffa [1969]. Raiffa admits that the procedure is not really 
practical in complex decisionmaking problems. However, it contains the 
basic idea which he does find useful in decisionmaking, and which we turn 
to in the next subsection. Keeney and Raiffa [1976] argue that a more 
practical procedure to assess a utility function is to use this procedure to 
fix a few utility values and then to fit a curve to these data points. The 
general shape of the curve is estimated by studying an individual’s atti- 
tudes toward risk. Some examples of computation of utility functions using 
variants of this idea are the following: Grayson [1960] computes utility 
functions of oil wildcatters involved in the search for gas and oil; Swalm 
[1966] and Spetzler [1968] compute utility functions of business executives; 
and Keeney [1972a] computes the utility function of operators of a hospital 
blood bank. For a discussion of these and other examples, see Keeney and 
Raiffa [1976]. 




Let us now formalize the result we have obtained. We say that
\((K, \succ, L, R)\) satisfies continuity if, for all c, d, d' belonging to K, whenever
\(d \succsim c \succsim d'\), then for some \(p \in [0, 1]\), the following lottery \(\ell\)


[Tree diagram of lottery \(\ell\): d with probability p, d' with probability 1 - p.]


is in L and \(\ell(c)\) is indifferent to \(\ell\). (We do not use the full force of
continuity, but use it only with \(d = c^*\) and \(d' = c_*\). The reader should
compare the notion of continuity used in Theorem 5.1.)


THEOREM 7.1 (Raiffa). Suppose K is a set, \(\succ\) is a binary relation on K, L
is a collection of lotteries with consequences in K, and R is a binary relation
on L. Suppose in addition that

(a) (K, L, R) satisfies the EV Hypothesis;

(b) (L, R) extends \((K, \succ)\);

(c) \((K, \succ)\) has “best” and “worst” consequences \(c^*\) and \(c_*\) satisfying Eqs.
(7.10) and (7.11); and

(d) \((K, \succ, L, R)\) satisfies continuity.

Then an order-preserving utility function \(\pi\) for \((K, \succ)\) is given by taking \(\pi(c)\)
equal to that number \(\pi\) so that \(\ell(c) E \ell_\pi\), where E (indifference) is defined in
Eq. (7.5) and \(\ell_\pi\) is the lottery (7.12).


Recall again that comparisons of expected utilities are meaningless if the 
utility function over consequences is only an ordinal scale, but are 
meaningful if this utility function is an interval scale. We shall show that 
under reasonable assumptions, an order-preserving utility function actually 
defines an interval scale. 


COROLLARY 1. Suppose hypotheses (b) through (d) of Theorem 7.1 hold, 
and in addition the EU Hypothesis holds. Suppose every full utility function* 
on K is regular and defines an interval scale. Then the order-preserving utility 


*Recall that we use the adjective “full” to distinguish order-preserving utility functions 
from utility functions. The latter may be required to satisfy more conditions than just 
order-preservation. 
function \(\pi\) defined in Theorem 7.1 is a full utility function on K, and it
defines a (regular) interval scale.


Proof. By the proof of Theorem 7.1, if u is a full utility function on K
satisfying the EU Hypothesis, then \(\pi(x) = \alpha u(x) + \beta\), \(\alpha > 0\). Since u
defines an interval scale, \(\pi\) must be a full utility function for K. Hence, it
defines an interval scale.


COROLLARY 2. Suppose hypotheses (b) through (d) of Theorem 7.1 hold. 
Suppose in addition that every full utility function u on K satisfies the EU 
Hypothesis. Finally, suppose that every order-preserving value function on K 
that satisfies the EV Hypothesis is a full utility function. Then every full 
utility function u on K defines a (regular) interval scale. 


Proof. Given \(u\), \(v = \alpha u + \beta\) is again a full utility function, if \(\alpha > 0\). For
\(v\) clearly is order-preserving and satisfies the EV Hypothesis. Next, given \(u\),
suppose \(v\) is any other full utility function on \(K\). By the proof of Theorem
7.1, there are constants \(\alpha, \beta, \gamma, \delta\), with \(\alpha, \gamma > 0\), so that

\[ \pi(x) = \alpha u(x) + \beta \]

and

\[ \pi(x) = \gamma v(x) + \delta. \]

Thus, for all \(x\),

\[ v(x) = \frac{\alpha}{\gamma}\, u(x) + \frac{\beta - \delta}{\gamma}. \]

∎


In closing this subsection, let us note that there are various alternative 
ways the EU Hypothesis can be used to calculate utilities over con- 
sequences. Some of these methods have been developed experimentally 
and, incidentally, provide experimental tests of the EU Hypothesis. See, 
for example, Mosteller and Nogee [1951], Davidson, Suppes, and Siegel 
[1957], or Coombs and Komorita [1958]. 


7.2.3 A Basic Reference Lottery Ticket 


Let us fix the same two consequences \(c_*\) and \(c^*\) described in Eqs. (7.10)
and (7.11). A ticket to enter the lottery (7.12) will be called a \(\pi\)-basic
reference lottery ticket or a \(\pi\)-brlt for short. To make a choice between two
complicated lotteries \(\ell\) and \(\ell'\), Raiffa [1968, 1969] suggests that we reduce
them to \(\pi\)-brlt’s. Namely, if \(c\) is a consequence and \(\ell(c) E \ell_{\pi(c)}\), then replace
\(c\) in lotteries \(\ell\) and \(\ell'\) by the lottery \(\ell_{\pi(c)}\). For example, the lottery

    [Tree diagram of \(\ell\): \(c\) with probability .8; \(d\) with probability .1; \(e\) with probability .1]

becomes the lottery

    [Tree diagram: \(\ell_{\pi(c)}\) with probability .8; \(\ell_{\pi(d)}\) with probability .1; \(\ell_{\pi(e)}\) with probability .1]

Using standard properties of tree diagrams (which we will again assume
hold), we can replace \(\ell\) by the simple \(\lambda\)-brlt

    [Tree diagram of \(\ell_\lambda\): \(c^*\) with probability \(\lambda\); \(c_*\) with probability \(1 - \lambda\)]

where \(\lambda = (.8)\pi(c) + (.1)\pi(d) + (.1)\pi(e)\). We have \(\ell E \ell_\lambda\). Similarly, we
reduce an alternative lottery \(\ell'\) to a \(\mu\)-brlt \(\ell_\mu\), with \(\ell' E \ell_\mu\). Then, if preference \(R\)
is strict weak, we should have

\[ \ell R \ell' \iff \ell_\lambda R \ell_\mu. \]

If the EV Hypothesis holds, we should have

\[ \ell_\lambda R \ell_\mu \iff \lambda > \mu, \]

since \(c^* > c_*\). The point of this procedure is that you can make choices
between lotteries or acts or choices without knowing a utility function over
consequences.

To illustrate how to make decisions using the idea of a \(\pi\)-brlt, let us give
some examples. Suppose a small businessman faces decreasing sales in his 
present location and considers moving to a new location. Suppose for want 
of further information, he thinks it is equally likely that one of the 
following will happen if he moves: 

a = he will increase his sales, 

b = his sales will stay at their present level, 

c = his sales will continue to decrease, 

d = he will face bankruptcy. 

Thus, the businessman faces a choice between the lotteries \(\ell(c)\) and

    [Tree diagram of \(\ell'\): \(a\), \(b\), \(c\), and \(d\), each with probability .25]

Suppose he chooses \(c^* = a\) and \(c_* = d\), and, on reflection, he is indifferent
between \(\ell(b)\) and

    [Tree diagram: \(a\) with probability .5; \(d\) with probability .5]

a .5-brlt; and he is indifferent between \(\ell(c)\) and

    [Tree diagram: \(a\) with probability .25; \(d\) with probability .75]

a .25-brlt. Then he is indifferent between \(\ell'\) and

    [Tree diagram: a 1-brlt, a .5-brlt, a .25-brlt, and a 0-brlt, each with probability .25]

which is the same as a .4375-brlt, since

\[ .25(1) + .25(.5) + (.25)(.25) + (.25)(0) = .4375. \]

Since .25 is smaller than .4375, the small businessman should choose \(\ell'\)
over \(\ell(c)\); that is, he should choose to move.*
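
The reduction to brlt’s is mechanical once the numbers \(\pi(c)\) have been judged, and it may help to see it as a short computation. The following is only a sketch: the function name `brlt_index` and the representation of a lottery as a list of (probability, consequence) pairs are our own illustrative choices, not part of Raiffa’s presentation.

```python
def brlt_index(lottery, pi):
    """Reduce a lottery to the index lambda of an equivalent lambda-brlt.

    lottery: list of (probability, consequence) pairs (illustrative format).
    pi: dict giving, for each consequence c, the judged number pi(c), i.e.
        the chance of c* in the brlt judged indifferent to c for certain.
    """
    total = sum(p for p, _ in lottery)
    if abs(total - 1.0) > 1e-9:
        raise ValueError("probabilities must sum to 1")
    # Replace each consequence by its brlt and collapse the two-stage tree.
    return sum(p * pi[c] for p, c in lottery)


# The businessman example: c* = a (sales increase), c_* = d (bankruptcy).
pi = {"a": 1.0, "b": 0.5, "c": 0.25, "d": 0.0}
move = [(0.25, "a"), (0.25, "b"), (0.25, "c"), (0.25, "d")]
stay = [(1.0, "c")]

lam_move = brlt_index(move, pi)
lam_stay = brlt_index(stay, pi)
choice = "move" if lam_move > lam_stay else "stay"
print(lam_move, lam_stay, choice)
```

With the judgments of the example, the lottery of moving reduces to a .4375-brlt and staying to a .25-brlt, so the sketch recommends moving, in agreement with the text.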


7.2.4 The Utility of Money 


As we remarked at the beginning of this chapter, the utility of money is 
not necessarily a linear function of the dollar amount. If it were, most 
people would not buy insurance. For the expected dollar loss from buying 
insurance is usually greater than the expected dollar loss from not buying 
it. (See Exer. 2, Section 7.1.) As the same dollar amount is added, at higher 
and higher levels of holdings, the utility increases by less and less. (We say 
the marginal utility is decreasing.) This was first observed by Bernoulli 
[1738], who suggested that the utility of $n is a logarithmic function of n. 
(See Exer. 2 below.) 

How might we estimate an individual’s utility function for money? The
most direct method is to use a procedure like that of Section 7.2.2. We fix
the lowest and highest dollar amounts in reason, for example, $0 and
$1,000,000. We then find that number \(\pi(n)\) so that an individual would pay
exactly $n to enter a lottery of the form

    [Tree diagram: $1,000,000 with probability \(\pi(n)\); $0 with probability \(1 - \pi(n)\)]


*The point of this approach is that this is how the businessman should behave (according
to Raiffa), not how he does behave. The method of \(\pi\)-brlt’s is very much a prescriptive one.


[Figure 7.1. Estimation of a utility function for money. Vertical axis: \(\pi(n)\); horizontal axis: dollar amount \(n\), from $0 to $1,000,000.]


Then \(\pi(n)\) is the utility of $n. Often it is sufficient to estimate \(\pi(n)\) for
several intermediate values of \(n\), and then fit a curve to these values,
provided one also checks the general shape of this curve. For most
individuals, the shape will be concave-down, for most individuals tend to
be risk-averse. (See Fig. 7.1 for a sample estimated utility curve for money
for such an individual.) Risk-averseness means that if \(c\), \(d\), and \(a\) are
monetary amounts and an individual is indifferent between \(\ell(a)\) and

    [Tree diagram of \(\ell(c, d)\): \(c\) with probability .5; \(d\) with probability .5]

then \(a \le (c + d)/2\). If \(\pi\) is the function so that \(\ell(x)\) is indifferent to \(\ell_{\pi(x)}\), it
follows that \(\ell(c, d)\) is indifferent to a \(\lambda\)-brlt with \(\lambda = .5\pi(c) + .5\pi(d)\), so that
\(\pi(a) = \lambda\). Thus, since \(\pi\) increases with the dollar amount,

\[ \pi\left(\frac{c + d}{2}\right) \ge \frac{\pi(c) + \pi(d)}{2}. \]

Hence, the function \(\pi\), the utility function for money, is concave-down.




Stevens [1959, p. 55] argues that \(\pi(a)\) is a power function \(\alpha a^{\beta}\), with \(\beta\) in the
vicinity of .4 or .5. Such a function has the concave-down shape. Apparently
the idea that the utility of money might be a power function goes
back to Gabriel Cramer in the eighteenth century (see Bernoulli [1738] and
Stevens [1959, p. 58]).
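
To see risk-averseness and the concave-down shape in numbers, one may take a power function as the utility of money and compute the sure amount \(a\) judged indifferent to a 50-50 lottery between \(c\) and \(d\). The sketch below assumes, purely for illustration, the exponent \(\beta = .5\) and a scaling that gives utility 0 to $0 and utility 1 to $1,000,000; the function names are ours.

```python
def power_utility(n, beta=0.5, top=1_000_000):
    """Assumed utility pi(n) = (n / top) ** beta, so pi(0) = 0 and pi(top) = 1."""
    return (n / top) ** beta


def certainty_equivalent(c, d, beta=0.5, top=1_000_000):
    """Sure amount a with pi(a) = .5 * pi(c) + .5 * pi(d)."""
    lam = 0.5 * power_utility(c, beta, top) + 0.5 * power_utility(d, beta, top)
    # Invert the power function to recover the dollar amount.
    return top * lam ** (1.0 / beta)


c, d = 0, 1_000_000
a = certainty_equivalent(c, d)
midpoint = (c + d) / 2
print(a, midpoint)                          # a is no larger than the midpoint
print(power_utility(midpoint), 0.5)         # pi at the midpoint vs. the average utility
```

Under these assumptions the individual would trade the 50-50 lottery between $0 and $1,000,000 for a sure $250,000, well below the expected dollar value of $500,000, and \(\pi\) at the midpoint exceeds the average of \(\pi(c)\) and \(\pi(d)\).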


7.2.5 The Allais Paradox 


Consider the following four lotteries:

    \(\ell^{(1)}\): $1,000,000 with probability 1.
    \(\ell^{(2)}\): $5,000,000 with probability .10; $1,000,000 with probability .89; $0 with probability .01.
    \(\ell^{(3)}\): $5,000,000 with probability .10; $0 with probability .90.
    \(\ell^{(4)}\): $1,000,000 with probability .11; $0 with probability .89.

We shall pose two problems: Problem 1 is to choose between \(\ell^{(1)}\) and \(\ell^{(2)}\),
and Problem 2 is to choose between \(\ell^{(3)}\) and \(\ell^{(4)}\). The French economist
Allais [1953], and others, have reported that most subjects prefer \(\ell^{(1)}\) to \(\ell^{(2)}\)
and \(\ell^{(3)}\) to \(\ell^{(4)}\). To quote Raiffa [1968, p. 80], most subjects reason as
follows: “In Problem 1, I have a choice between $1,000,000 for certain and
a gamble where I might end up with $0. Why gamble? In Problem 2, there
is a good chance that I will end up with $0 no matter what I do. The
chances of getting $5,000,000 are almost as good as getting $1,000,000, so I
might as well go for the $5,000,000 and choose” \(\ell^{(3)}\) over \(\ell^{(4)}\).

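Following Raiffa, let us try to analyze Problems 1 and 2 by using
\(\pi\)-brlt’s. We let \(c^*\) be $5,000,000 and \(c_*\) be $0. Then \(\ell^{(1)}\) is (judged)
indifferent to a \(\pi_1\)-brlt, for some \(\pi_1\); that is, \(\ell^{(1)}\) is indifferent to \(\ell_{\pi_1}\). Now
\(\ell^{(2)}\) is indifferent to

    [Tree diagram: the 1-brlt \(\ell_1\) with probability .10; \(\ell_{\pi_1}\) with probability .89; the 0-brlt \(\ell_0\) with probability .01]

which is the same as

    [Tree diagram of \(\ell_{\pi_2}\): $5,000,000 with probability \(\pi_2\); $0 with probability \(1 - \pi_2\)]

where \(\pi_2 = .10 + .89\pi_1\). Similarly, \(\ell^{(3)}\) is indifferent to \(\ell_{\pi_3}\), where \(\pi_3 = .10\),
and \(\ell^{(4)}\) is indifferent to \(\ell_{\pi_4}\), where \(\pi_4 = .11\pi_1\). Thus,

\[ \ell^{(1)} R \ell^{(2)} \iff \pi_1 > \pi_2 \iff \pi_1 > .10 + .89\pi_1 \iff .11\pi_1 > .10 \iff \pi_1 > \frac{10}{11} \]

and

\[ \ell^{(3)} R \ell^{(4)} \iff \pi_3 > \pi_4 \iff .10 > .11\pi_1 \iff \pi_1 < \frac{10}{11}. \]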

Thus, decisionmakers who make choices using the method of \(\pi\)-brlt’s
cannot make choices of both \(\ell^{(1)}\) over \(\ell^{(2)}\) and \(\ell^{(3)}\) over \(\ell^{(4)}\). Decisionmakers
who would make these choices would violate the EU Hypothesis. For if \(u\)
is a utility function on \(K\) satisfying the EU Hypothesis, then

\[ \ell^{(1)} R \ell^{(2)} \iff u(\$1{,}000{,}000) > .10u(\$5{,}000{,}000) + .89u(\$1{,}000{,}000) + .01u(\$0) \]
\[ \iff .11u(\$1{,}000{,}000) > .10u(\$5{,}000{,}000) + .01u(\$0) \]

and

\[ \ell^{(3)} R \ell^{(4)} \iff .10u(\$5{,}000{,}000) + .90u(\$0) > .11u(\$1{,}000{,}000) + .89u(\$0) \]
\[ \iff .10u(\$5{,}000{,}000) + .01u(\$0) > .11u(\$1{,}000{,}000). \]

Raiffa argues that it is exactly such examples that illustrate why people
make mistakes in decisionmaking. Thus, they should make decisions by
using a precise procedure such as the EU Hypothesis, or the method of
\(\pi\)-brlt’s, to avoid making “inconsistent” judgments. Others have tried to
argue that the Allais Paradox is not really a paradox, for it does not really
exhibit a violation of the EU Hypothesis. Morrison [1967] argues this way,
arguing that $0 plays a different role in different parts of the two problems.
Note that the Allais Paradox is a paradox regardless of what the utility
function for money may be.
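
The last remark can be checked numerically. In the sketch below, which is ours and not Raiffa’s, utilities for $0, $1,000,000, and $5,000,000 are drawn at random (as exact fractions, to avoid rounding), and the two expected-utility differences are compared. The names `expected_utility`, `differences`, and `allais_count` are illustrative.

```python
from fractions import Fraction as F
import random

M1, M5 = 1_000_000, 5_000_000
L1 = [(F(1), M1)]
L2 = [(F(10, 100), M5), (F(89, 100), M1), (F(1, 100), 0)]
L3 = [(F(10, 100), M5), (F(90, 100), 0)]
L4 = [(F(11, 100), M1), (F(89, 100), 0)]


def expected_utility(lottery, u):
    """Expected utility of a list of (probability, dollar amount) pairs."""
    return sum(p * u[c] for p, c in lottery)


def differences(u):
    """Return E(l1) - E(l2) and E(l3) - E(l4) for the utility table u."""
    d12 = expected_utility(L1, u) - expected_utility(L2, u)
    d34 = expected_utility(L3, u) - expected_utility(L4, u)
    return d12, d34


random.seed(0)
allais_count = 0  # number of utility tables ranking l1 over l2 AND l3 over l4
for _ in range(1000):
    u0, u1, u5 = sorted(F(random.randint(0, 10**6), 10**6) for _ in range(3))
    d12, d34 = differences({0: u0, M1: u1, M5: u5})
    assert d12 + d34 == 0  # the two differences always cancel exactly
    if d12 > 0 and d34 > 0:
        allais_count += 1
print(allais_count)  # 0
```

Since the two differences are negatives of one another for every choice of \(u\), no utility function used with the EU Rule can produce both of the commonly reported preferences.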


7.2.6 The Public Health Problem: The Value of a Life.* 


In this subsection, we apply some of the ideas of earlier subsections. 
Suppose under present conditions a certain proportion p of the population 
will die in a given year of some disease—let us say, for concreteness, 
respiratory disease due to air pollution. Suppose a study has shown that by 
spending x dollars over the next year, we can reduce the proportion of 
people who die from p to p’. How does society decide whether to spend 
that x dollars? Let us call this question the public health question. 

Naturally, the public health question is usually not considered in iso- 
lation—there are usually alternative possible uses for the same amount of 
money, and one would usually compare these alternative uses. However, 
one reasonable alternative is to keep the money for later spending, and one 
would at least like to be able to decide if it is worth not keeping it, that is, 
if it is worth spending x dollars to decrease p to p’. What is the tradeoff 
between health and (monetary) assets? This is the question we consider. 

Suppose we let s be the sum total of society’s assets at the present time. 
Assets will be measured, for the purposes of our discussion, in monetary 
terms. (They may, of course, include resources, capital, etc.) Let us 


*The discussion in this subsection is motivated by that in Raiffa [1969, Section 8]. 




consider first the consequences for a particular individual of spending x 
dollars on air pollution reduction, if this will reduce p to p’. Suppose there 
are N people in all. A particular individual’s share of the expense, in the 
most simple-minded model where wealth is shared equally, will be (1/N)x. 
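His share of the initial assets is \((1/N)s\). The numbers \(q = 1 - p\) and
\(q' = 1 - p'\) can be interpreted as the individual’s “probability of life.”
Thus at present he faces the following lottery \(\ell\):

    [Tree diagram of \(\ell\): Death with probability \(p\); Life, with assets \(s/N\), with probability \(q\)]

If \(x\) dollars are spent, he will face the lottery \(\ell'\):

    [Tree diagram of \(\ell'\): Death with probability \(p'\); Life, with assets \((s - x)/N\), with probability \(q'\)]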

Assuming the EU Hypothesis, letting u be an order-preserving utility 
function which satisfies this hypothesis, we may assume that u(Death) = 0. 
(For given any constant A, u + A is still an order-preserving utility func- 
tion, and it satisfies the EU Hypothesis.) Using u(Death) = 0, we have 


\[ E(\ell) = p \cdot u(\text{Death}) + q \cdot u(\text{Life}, s/N) = q \cdot u(\text{Life}, s/N). \]
Similarly,
\[ E(\ell') = q' \cdot u(\text{Life}, (s - x)/N). \]


Let us suppose that in fact the individual in question is indifferent between
\(\ell\) and \(\ell'\); that is, if he starts with assets \(s/N\), then \(x/N\) is just at the
borderline of the amount of money he would be willing to spend to raise
the probability of life from \(q\) to \(q'\). Let us suppose that \(q''/q''' = q/q'\). It
follows that under the same starting assets \(s/N\), \(x/N\) is exactly the
amount he would be willing to spend to raise \(q''\) to \(q'''\). For,

\[ E(\ell) = E(\ell') \quad \text{iff} \quad E(\ell'') = E(\ell'''), \]

where \(\ell''\) and \(\ell'''\) are the lotteries

    [Tree diagram of \(\ell''\): Death with probability \(1 - q''\); Life, \(s/N\), with probability \(q''\)]
    [Tree diagram of \(\ell'''\): Death with probability \(1 - q'''\); Life, \((s - x)/N\), with probability \(q'''\)]

respectively. This conclusion is quite interesting. To illustrate it, let us take

\[ q = \frac{999{,}900}{1{,}000{,}000}, \qquad q' = \frac{999{,}909.999}{1{,}000{,}000} \approx \frac{999{,}910}{1{,}000{,}000}, \]

\[ q'' = \frac{999{,}990}{1{,}000{,}000}, \qquad q''' = \frac{999{,}999.9999}{1{,}000{,}000}. \]

Then \(q/q' = q''/q''' \approx .99999\), and

\[ p = \frac{100}{1{,}000{,}000}, \qquad p' \approx \frac{90}{1{,}000{,}000}, \qquad p'' = \frac{10}{1{,}000{,}000}, \qquad p''' \approx 0. \]

Thus, the conclusion says that, if it is worth x/N dollars to you to reduce 
deaths from air pollution from 100 people per million to approximately 90 
people per million, then (at the same level of starting assets) it should be 
worth exactly the same amount to reduce the figure from 10 per million to 
approximately 0 per million. (This conclusion should be evaluated in the 
light of the argument that cutting the probability of death from a particu- 
lar cause those last few steps down to a very small amount is sometimes 
proportionately more expensive. Of course, the conclusion is based on 
some very strong assumptions.) 
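
The arithmetic behind this illustration is easy to verify exactly. The sketch below uses rational arithmetic; the two utility values for life are hypothetical, chosen only so that the individual is indifferent between the first pair of lotteries, and the variable names are ours.

```python
from fractions import Fraction as F

# The four probabilities of life from the illustration, as exact fractions.
q, q1 = F(999_900, 10**6), F(999_909_999, 10**9)       # q and q'
q2, q3 = F(999_990, 10**6), F(9_999_999_999, 10**10)   # q'' and q'''

ratio = q / q1
same_ratio = (ratio == q2 / q3)

# Hypothetical utilities with u(Death) = 0, chosen so that E(l) = E(l').
u_before = F(1)             # stands for u(Life, s/N)
u_after = ratio * u_before  # stands for u(Life, (s - x)/N)

indifferent_first = (q * u_before == q1 * u_after)    # E(l)   = E(l')
indifferent_second = (q2 * u_before == q3 * u_after)  # E(l'') = E(l''')
print(ratio, float(ratio), same_ratio, indifferent_first, indifferent_second)
```

Both ratios equal 100000/100001, which is .99999 to five decimal places, and indifference in the first problem carries over to the second exactly as the argument claims.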

(A similar analysis applies to an individual’s preferences among alternative
treatments for a disease. The question here is: If treatment \(a\) costs \(x\)
dollars and has a probability \(q\) of success and treatment \(b\) costs \(x'\) dollars
and has a probability \(q'\) of success, which treatment is preferred?)

So far our analysis has been from the point of view of an individual. Let
us now make the analysis from the point of view of society. This situation
is more difficult. We shall treat the set of consequences as multidimensional
alternatives, as in Chapter 5. (More on multidimensional alternatives
in the next section.) \(K\) will be \(A_1 \times A_2\), where \(A_1\) and \(A_2\) are both the
set of nonnegative real numbers. A pair \((n, a)\) represents a state of society
consisting of \(n\) individuals with total assets \(a\). Let \(>\) be society’s relation
of (strict) preference on \(K\). To answer the public health question, it will
suffice to calculate an order-preserving utility function \(u\) over \((K, >)\). For,
suppose our current assets are \(s\) and our current population is \(N\). If we do
nothing, then after one year our population will be \(qN + \alpha N - \beta N\), where
\(\alpha\) is the birth rate and \(\beta\) is the death rate due to all causes other than air
pollution.* Our assets will be \(s\). If we spend \(x\) dollars, our population will
become \(q'N + \alpha N - \beta N\) and our assets \(s - x\). Thus, whether we spend \(x\)
dollars or not depends on which we prefer,

\[ (qN + \alpha N - \beta N,\ s) \quad \text{or} \quad (q'N + \alpha N - \beta N,\ s - x), \]

or equivalently on which is bigger,

\[ u(qN + \alpha N - \beta N,\ s) \quad \text{or} \quad u(q'N + \alpha N - \beta N,\ s - x). \]


Of course, if calculation of the utility function u is based entirely on the 
relation >, as it was in Chapter 5, then we cannot know u unless we know 
>, and since > is what we need to know to answer the public health 
question, we do not gain anything by going to utility functions. For- 
tunately, there are techniques, such as those using Theorem 7.1, for 
calculating an order-preserving utility function u over (K, >) even without 
much knowledge about >. 

The preference relation > and the utility function will differ from 
society to society, as different societies have different points of view about 
human life, assets, etc. Thus we shall have to make some assumptions 
about a hypothetical society in order to go further in answering the public 
health question, and in particular in applying the result of Theorem 7.1. 
We shall consider two different societies. 

Let us assume for the sake of discussion that, as before, societal wealth 
in our first society is spread equally among the population. Moreover, let 
us assume that the society has determined a certain level of assets C for an 
individual which is considered the comfort level—at level C he has all he 
can possibly need. Moreover, in our hypothetical society, let us suppose 
that too much wealth for an individual is also considered bad—it leads to 
decadence, decay, etc. Thus, for a given level a of the society’s assets, there 
will be an optimal population \(n\), namely that \(n\) so that \(a/n = C\).


*We disregard the changes in \(\alpha\) and \(\beta\) due to changing \(q\).

For this hypothetical society, let us use the method of calculating a
utility function described in Theorem 7.1. Let \(a_0\) be an arbitrary number,
and let \(a_1, n_1\) be numbers such that \(a_1/n_1 = C\). Pick \(c_* = (0, a_0)\) and
\(c^* = (n_1, a_1)\). If \(c\) is any other consequence, then by our assumptions,
\(c^* > c\). Moreover, it is reasonable to assume that \(c > c_*\): anything is
better than having the entire population annihilated. Moreover, \(c^* > c_*\).
Assuming that continuity and the other hypotheses of Theorem 7.1 hold,
we now measure utility of \((n, a)\) as that number \(\pi(n, a)\) so that we are
indifferent between attaining \((n, a)\) with certainty and the lottery

    [Tree diagram of \(\ell\): \((n_1, a_1)\) with probability \(\pi(n, a)\); \((0, a_0)\) with probability \(1 - \pi(n, a)\)]

In the lottery \(\ell\), we have a \(1 - \pi(n, a)\) chance of total annihilation (say by
nuclear war) and a \(\pi(n, a)\) chance of reaching the ideal spread of resources.
Society should be as happy with this risk as with a guarantee of
\((n, a)\). Of course this method of calculating \(\pi(n, a)\) is a bit unrealistic, but
perhaps it will be helpful in thinking about the utility function \(\pi(n, a)\). The
properties of \(\pi(n, a)\) for different societies are by no means well thought
out. Once \(\pi(n, a)\) has been calculated, it can be used to choose between

\[ (qN + \alpha N - \beta N,\ s) \quad \text{and} \quad (q'N + \alpha N - \beta N,\ s - x), \]


and so to solve the public health problem. 

Let us compare an alternative society, which in choosing between two
alternatives \((n, a)\) and \((m, b)\), prefers that situation in which average assets
per person are highest. Thus, for this society, if \(n, m \ne 0\),

\[ (n, a) > (m, b) \iff a/n > b/m. \]

We assume that \((0, a)\) is the worst possible situation, all \(a\). Now there is no
“best” consequence \(c^*\), so Theorem 7.1 does not apply. However, it is easy
to calculate a utility function \(u\) over \((K, >)\). Namely, we take

\[ u(n, a) = \begin{cases} a/n & \text{if } n \ne 0, \\ 0 & \text{if } n = 0. \end{cases} \]

In this case, assuming \(qN + \alpha N - \beta N \ne 0\) and \(q'N + \alpha N - \beta N \ne 0\),

\[ u(qN + \alpha N - \beta N,\ s) = \frac{s}{qN + \alpha N - \beta N} = \frac{1}{N} \cdot \frac{s}{q + \alpha - \beta}, \]

\[ u(q'N + \alpha N - \beta N,\ s - x) = \frac{1}{N} \cdot \frac{s - x}{q' + \alpha - \beta}. \]

Thus, it is “worth” spending \(x\) dollars to reduce from \(p\) to \(p'\) if and only if

\[ \frac{s - x}{q' + \alpha - \beta} > \frac{s}{q + \alpha - \beta}. \]

Since \(x > 0\) and \(q' > q\), this is never true! This society never spends the
money. Spending the money would only lead to more people and fewer
assets. We return to the public health problem in the next section. For
more on this problem, see, for example, Schelling [1968].
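
The decision rule of this second society can be written out in a few lines. The sketch below is only an illustration: the population, assets, cost, and rates passed to `worth_spending` are hypothetical numbers of our own choosing, not data from the text.

```python
def u_average_assets(n, a):
    """Utility of the second hypothetical society: assets per person, 0 if n = 0."""
    return a / n if n != 0 else 0.0


def worth_spending(N, s, x, q, q_prime, alpha, beta):
    """True if spending x is preferred, comparing the two resulting states."""
    pop_without = q * N + alpha * N - beta * N
    pop_with = q_prime * N + alpha * N - beta * N
    return u_average_assets(pop_with, s - x) > u_average_assets(pop_without, s)


# All numbers below are hypothetical.
decision = worth_spending(
    N=1_000_000, s=5e10, x=1e6,
    q=0.9999, q_prime=0.99991, alpha=0.015, beta=0.008,
)
print(decision)  # False
```

Whatever positive values are tried for the cost \(x\) and the improvement from \(q\) to \(q'\), the comparison comes out the same way, as long as both populations are positive: fewer assets are divided among more people.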


Exercises 


1. Suppose two alternative medical treatments are being considered. 
Treatment x leads to complete cure with probability .1, paralysis of a leg 
with probability .4, and loss of a leg with probability .5. Treatment y leads 
to these outcomes with respective probabilities .2, .1, and .7. The patient is 
indifferent between losing a leg with certainty and a 50-50 chance between 
complete cure and death. He is also indifferent between having a paralyzed 
leg with certainty and an 80-20 chance of complete cure or death. 
According to the EU Hypothesis, which treatment would he prefer? 


2. Bernoulli was led to consider a logarithmic function for \(u(\$n)\) in part
because of the so-called St. Petersburg Paradox. This paradox arises in
the following game. A coin is tossed. If a head appears on the first trial, the
gambler wins $2. If not, the coin is tossed again. If a head appears on the
next toss, the gambler wins $4. In general, if a head appears for the first
time on the \(n\)th toss, the gambler wins \(\$2^n\). How much would you be
willing to pay to play this game? Compare your answer to your expected
winnings in dollars. This should suggest that the utility function for money
is not linear in the dollar amount.


3. A candidate running for President is entered in a primary election 
with three other candidates. He figures that if he takes a strong stand on 
abortion, he will either end up first among the four candidates, or last, with 
the two events being equally likely. If he says nothing, he is assured of 
either second or third place, the former with probability 3/4. How might 
the candidate use the method of \(\pi\)-brlt’s to determine whether or not to
take the stand? 


4. A student has applied to four colleges, and has been admitted to his 
third- and fourth-choice schools. He has to let his third-choice school 
know immediately whether he will come, but has not yet heard from his 
top two choice schools. How might the student decide whether to accept 
his third choice? 


5. Is the method of \(\pi\)-brlt’s relevant to the investment decision of Exer.
6 of Section 7.1? 


6. Consider the meaningfulness of the statement \(\pi(\$1000) < 2\pi(\$500)\)
of Section 7.2.2. 


7. (Raiffa [1968, p. 89]) Let \(\ell\) be a lottery with dollar amounts as
consequences. Let \(b(\ell)\) be the price you would be willing to pay to enter
the lottery \(\ell\), \(s(\ell)\) the price for which you would sell your ticket to enter \(\ell\).

(a) It is not necessarily the case that \(b(\ell) = s(\ell)\). For you are
indifferent between having \(s(\ell)\) for certain and having the lottery \(\ell\). You
are also indifferent between having nothing and entering a lottery \(\ell'\) obtained
from \(\ell\) by subtracting \(b(\ell)\) from each consequence in \(\ell\). Thus,
\(u(s) = E(\ell)\), \(u(0) = E(\ell')\). Can you think of lotteries \(\ell\) and utility functions
\(u(\$n)\) where \(b(\ell) \ne s(\ell)\)?

(b) Show that if \(u(\$n) = n\), all \(n\), then \(b(\ell)\) always equals \(s(\ell)\).

(c) Show that if \(u(\$n) = 1 - e^{-\lambda n}\), where \(\lambda \ne 0\) is a measure of risk
aversion, then \(b(\ell)\) always equals \(s(\ell)\).


8. Suppose \(c > d > e\), \((L, R)\) extends \((K, >)\), and the EU Hypothesis
holds. Show that there is no \(p \in [0, 1]\) so that you are indifferent between


\(\ell(c)\) and


9. Suppose \(K = \{1, 2, \ldots, n\}\), the relation \(>\) on \(K\) is the usual ordering \(>\)
of numbers, \(L\) consists of all lotteries with consequences in \(K\), and \((L, R)\)
extends \((K, >)\). Show that no matter what the function \(u\) on \(K\) is, if \(R\) is
defined on \(L\) from \(u\) and the EU Rule, then \((K, >, L, R)\) satisfies continuity.

10. (a) If (K, L, R) satisfies the EU Hypothesis, does it follow that 
(L, R) is a strict weak order? 
(b) What about (K, >)? 


11. Suppose \((L, R)\) extends \((K, >)\), \(u\) is a value function on \(K\), and
\((K, L, R)\) satisfies the following modified version of the EV Hypothesis:

\[ \ell R \ell' \iff \sum_i p_i u(c_i) > \sum_j q_j u(d_j) + \delta, \]

where \(\delta\) is a fixed positive number and \(\ell\) and \(\ell'\) are

    [Tree diagrams: \(\ell\) yields consequence \(c_i\) with probability \(p_i\); \(\ell'\) yields consequence \(d_j\) with probability \(q_j\)]

(a) Show that \((L, R)\) is a semiorder.
(b) Is \((K, >)\)?


7.3 Multidimensional Alternatives
7.3.1 Reducing Number of Dimensions 


In this section, we consider choices among lotteries with multidimen- 
sional consequences, such as we encountered in the public health problem 
in Section 7.2.6. Thus, the set \(K\) of consequences will have a product
structure \(K = A_1 \times A_2 \times \cdots \times A_n\). See Farquhar [1977] and Keeney and
Raiffa [1976] for recent surveys of the literature of this subject. Keeney 
and Raiffa also have an excellent summary of the growing number of 
applications of utility functions computed using lotteries with multidimen- 
sional consequences. Included in the applications are work on preference 
tradeoffs among instructional programs (Roche [1971]); decisionmaking 
concerning sulfur dioxide emissions in New York City (Ellis [1970], Ellis 
and Keeney [1972]); analysis of response times by emergency services such 
as fire departments (Keeney [1973b]); study of safety of landing aircraft 
(Yntema and Klem [1965]); analysis of sewage sludge disposal in Boston 
(Horgan [1972]); assessment of the risks in transport of hazardous sub- 
stances (Kalelkar, Partridge, and Brooks [1974]); understanding options in 
control of the spruce budworm in Canadian forests (Bell [1975]); and 
developing alternatives for expansion of Mexico City’s airport (de Neuf- 
ville and Keeney [1972], Keeney [1973a]). 

We shall begin by continuing the discussion of Section 5.2 on how to 
reduce the number of dimensions. We return to the medical decisionmak- 
ing problem discussed in Section 5.2. We considered the following dimen- 
sions for evaluating the result of a treatment: 


    \(a_1\) = amount of money spent for treatment, drugs, etc.,
    \(a_2\) = number of days in bed with a high index of discomfort,
    \(a_3\) = number of days in bed with a medium index of discomfort,
    \(a_4\) = number of days in bed with a low index of discomfort,
    \(a_5\) = 1 if complication A occurs, 0 if it does not occur,
    \(a_6\) = 1 if complication B occurs, 0 if it does not occur,
    \(a_7\) = 1 if complication C occurs, 0 if it does not occur.

In Section 5.2, we showed how one might find some number \(a_3'''\) so that
\((a_1, a_2, a_3, a_4, a_5, a_6, a_7)\) was judged indifferent to \((0, 0, a_3''', 0, a_5, a_6, a_7)\).
Let us assume that it is only possible to get one of the complications A, B,
and C, and so \((a_5, a_6, a_7)\) is only (0, 0, 0), (1, 0, 0), (0, 1, 0), or (0, 0, 1). Fix
\(a_3'''\) and let \(\mathbf{a} = (0, 0, a_3''', 0)\). The vector \(\mathbf{a}\) represents a certain level of
“discomfort.” Suppose that complication A is the worst possible complication,
complication B is next worst, and having no complication is of course
the best. Thus, for fixed \(\mathbf{a}\), we have

\[ (\mathbf{a}, 0, 0, 0) > (\mathbf{a}, 0, 0, 1) > (\mathbf{a}, 0, 1, 0) > (\mathbf{a}, 1, 0, 0), \]

where \((\mathbf{a}, x, y, z)\) is short for \((0, 0, a_3''', 0, x, y, z)\). Using \(c^* = (\mathbf{a}, 0, 0, 0)\)
and \(c_* = (\mathbf{a}, 1, 0, 0)\), we can calculate \(u(\mathbf{a}, 0, 1, 0)\) and \(u(\mathbf{a}, 0, 0, 1)\), by
the method of Theorem 7.1. Specifically, using \(c^*\) and \(c_*\), we seek a
number \(\pi_B\) so that the constant lottery \(\ell(\mathbf{a}, 0, 1, 0)\) is judged indifferent to

    [Tree diagram: \((\mathbf{a}, 0, 0, 0)\) with probability \(\pi_B\); \((\mathbf{a}, 1, 0, 0)\) with probability \(1 - \pi_B\)]

the lottery with a \(\pi_B\) chance of “discomfort” \(\mathbf{a}\) and no complications and
a \((1 - \pi_B)\) chance of “discomfort” \(\mathbf{a}\) and complication A. We also seek a
similar number \(\pi_C\). Using these numbers we can calculate \(u(\mathbf{a}, 0, 1, 0)\) and
\(u(\mathbf{a}, 0, 0, 1)\) for all \(\mathbf{a}\), knowing only \(u(\mathbf{a}, 0, 0, 0)\) and \(u(\mathbf{a}, 1, 0, 0)\). Thus,
we can reduce calculation of \(u(a_1, a_2, a_3, a_4, a_5, a_6, a_7)\) to calculation of
\(u(0, 0, a_3''', 0, 0, 0, 0)\) and \(u(0, 0, a_3''', 0, 1, 0, 0)\), for all \(a_3'''\). This has essentially
reduced matters to a two-dimensional problem, using dimensions
3 and 5. For recent references on the subject of reducing the number of
dimensions, see MacCrimmon and Siu [1974], MacCrimmon and Wehrung
[1978], or Keeney [1971, 1972b].
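
Under the EU Hypothesis, the step from the judged numbers \(\pi_B\) and \(\pi_C\) to the utilities of the intermediate complications is a weighted average of the two known utilities. The sketch below illustrates this; the values given to \(\pi_B\), \(\pi_C\), and the two endpoint utilities are hypothetical, and the function name is ours.

```python
def utility_with_complication(pi, u_none, u_worst):
    """Expected utility of the brlt judged indifferent to the complication:
    pi * u(a, 0, 0, 0) + (1 - pi) * u(a, 1, 0, 0)."""
    return pi * u_none + (1.0 - pi) * u_worst


# Hypothetical judgments and endpoint utilities for one fixed discomfort level a.
pi_B, pi_C = 0.4, 0.7
u_none, u_A = 1.0, 0.0   # u(a, 0, 0, 0) and u(a, 1, 0, 0)

u_B = utility_with_complication(pi_B, u_none, u_A)  # u(a, 0, 1, 0)
u_C = utility_with_complication(pi_C, u_none, u_A)  # u(a, 0, 0, 1)
print(u_A, u_B, u_C, u_none)
```

With these illustrative numbers the four utilities come out in the order assumed in the text: complication A lowest, then B, then C, then no complication.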


7.3.2 Additive Utility Functions 


In Section 5.4, we considered the problem of finding an additive utility
function, that is, an order-preserving utility function \(u\) on a product
\(A_1 \times A_2 \times \cdots \times A_n\) such that for all \((a_1, a_2, \ldots, a_n)\) belonging to
\(A_1 \times A_2 \times \cdots \times A_n\),

\[ u(a_1, a_2, \ldots, a_n) = u_1(a_1) + u_2(a_2) + \cdots + u_n(a_n), \qquad (7.13) \]

where \(u_i\) \((i = 1, 2, \ldots, n)\) is a real-valued function on \(A_i\). Let us call such a
function \(u\) an additive order-preserving utility function. In case \(n = 2\), Eq.
(7.13) takes the form

\[ u(a, b) = u_1(a) + u_2(b). \qquad (7.14) \]

In Section 5.4.3 we stated very general conditions on a set \(A_1 \times A_2\)
sufficient to guarantee the existence of an order-preserving utility function
satisfying Eq. (7.14).

In this subsection, we shall assume that the set \(K\) of consequences has a
product structure \(A_1 \times A_2\). Assuming the existence of lotteries and the EU
Hypothesis, we state rather simple assumptions which give rise to an
order-preserving utility function \(u\) on \(K\) satisfying Eq. (7.14). Our approach
follows that of Raiffa [1969], though the details are different. Let us say
that the structure \((K, L, R)\) satisfies the marginality assumption (relative to
\(a_*\) and \(b_*\)) if, for any \(a \in A_1\) and \(b \in A_2\), the decisionmaker is indifferent
between the following lotteries \(\ell_1\) and \(\ell_2\):

    [Tree diagram of \(\ell_1\): \((a, b)\) with probability 1/2; \((a_*, b_*)\) with probability 1/2]
    [Tree diagram of \(\ell_2\): \((a, b_*)\) with probability 1/2; \((a_*, b)\) with probability 1/2]        (7.15)


In each of \(\ell_1\) and \(\ell_2\) there is a 50-50 chance between \(a\) and \(a_*\) and a 50-50
chance between \(b\) and \(b_*\), and so it is reasonable to be indifferent between
them.* We say \((K, L, R)\) satisfies the marginality assumption if it satisfies
the marginality assumption relative to some \(a_*\) and \(b_*\). If there is an
additive order-preserving utility function \(u\) on \(K\) and if \(u\) satisfies the EU
Hypothesis, then the marginality assumption (relative to any \(a_*\) and \(b_*\))
follows easily. Indeed, the marginality assumption follows from the existence
of a value function \(u\) on \(K\) which is additive [satisfies Eq. (7.14)]
and satisfies the EV Hypothesis. We shall show that the marginality
assumption is essentially sufficient for an additive value function. Suppose
the value function \(u : K \to \mathrm{Re}\) satisfies the EV Hypothesis. Then by the
marginality assumption,

\[ \tfrac{1}{2}u(a, b) + \tfrac{1}{2}u(a_*, b_*) = \tfrac{1}{2}u(a, b_*) + \tfrac{1}{2}u(a_*, b), \]

for all \(a \in A_1\) and \(b \in A_2\). Thus,

\[ u(a, b) + u(a_*, b_*) = u(a, b_*) + u(a_*, b). \qquad (7.16) \]

Define \(u_i\) on \(A_i\) \((i = 1, 2)\) by

\[ u_1(a) = u(a, b_*) - u(a_*, b_*), \qquad u_2(b) = u(a_*, b). \qquad (7.17) \]

By Eq. (7.16), we find immediately that

\[ u(a, b) = u_1(a) + u_2(b). \qquad (7.18) \]


*This assumption is sometimes called marginal independence or value independence. The
reason for the term “marginality” comes from probability theory: each lottery has the same
marginal probability distributions for \(a\) outcomes and \(b\) outcomes, though the joint distributions
differ.


If (L, R) extends (K, >), it follows that u is an order-preserving utility 
function for (K, >), and hence by (7.18) an additive order-preserving 
utility function. This proof follows one of Raiffa [1969, p. 36], though the 
result is due to Fishburn [1965]. (For references, see also Fishburn [1967, 
1969a,b, 1970, Chapter 11], Fishburn and Keeney [1974].) We summarize 
the results as a theorem. 


THEOREM 7.2 (Fishburn). Suppose \(K = A_1 \times A_2\) is a set, \(>\) is a binary
relation on \(K\), \(L\) is a collection of lotteries with consequences in \(K\), and \(R\) is a
binary relation on \(L\). If \((K, L, R)\) satisfies the EV Hypothesis and if the
marginality assumption holds for \((K, L, R)\), then any value function \(u\) on \(K\)
satisfying the EV Hypothesis is additive. Moreover, if \((L, R)\) extends
\((K, >)\), then \(u\) is an additive order-preserving utility function for \((K, >)\).


Remark: This theorem generalizes easily to the situation where we have
\(K = A_1 \times A_2 \times \cdots \times A_n\), \(n > 2\). See Raiffa [1969, p. 37] or Farquhar
[1974] for a formulation.
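
The construction in Eq. (7.17) can be tried out on a small finite example. In the sketch below the sets `A1` and `A2`, the reference elements, and the two value functions `u` and `v` are all hypothetical choices of ours: `u` satisfies Eq. (7.16) and so splits into \(u_1 + u_2\), while `v` has an interaction between components and fails Eq. (7.16).

```python
from itertools import product

A1, A2 = [0, 1, 2], [0, 1, 2, 3]   # hypothetical finite component sets
a_star, b_star = 0, 0              # reference elements a_* and b_*


def u(a, b):
    """A hypothetical value function that satisfies Eq. (7.16)."""
    return a**2 + 3 * b + 5


def u1(a):
    """First component, as in Eq. (7.17)."""
    return u(a, b_star) - u(a_star, b_star)


def u2(b):
    """Second component, as in Eq. (7.17)."""
    return u(a_star, b)


def satisfies_7_16(f):
    """Check f(a, b) + f(a_*, b_*) = f(a, b_*) + f(a_*, b) on all of A1 x A2."""
    return all(
        f(a, b) + f(a_star, b_star) == f(a, b_star) + f(a_star, b)
        for a, b in product(A1, A2)
    )


marginality_holds = satisfies_7_16(u)
additive = all(u(a, b) == u1(a) + u2(b) for a, b in product(A1, A2))


def v(a, b):
    """A hypothetical value function with interaction between components."""
    return a * b


v_marginality = satisfies_7_16(v)
print(marginality_holds, additive, v_marginality)
```

The first function decomposes exactly as Eq. (7.18) predicts; the second, like the eggs-and-salt example discussed below, violates the marginality condition and so cannot be written as a sum of component functions.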


Let us ask whether or not the marginality assumption is satisfied in
various examples. In the public health problem, we can consider a choice
between the following lotteries, with \(n \ne 0\) or \(a \ne 0\):

    [Tree diagram of \(\ell_1\): \((n, a)\) with probability 1/2; \((0, 0)\) with probability 1/2]
    [Tree diagram of \(\ell_2\): \((0, a)\) with probability 1/2; \((n, 0)\) with probability 1/2]

Certainly \(\ell_1\) would usually be preferred; in \(\ell_2\) both consequences are as bad
as can be. This violates the marginality assumption (relative to 0 and 0).
Thus, either the EV Hypothesis is false or the existence of an additive
value (utility) function is false, for we have seen that the marginality
assumption (relative to any \(a_*\) and \(b_*\)) follows from these two assumptions.
Of course, here additivity seems to fail, at least for the second
hypothetical society of Sec. 7.2.6. For suppose \(u(n, a) = u_1(n) + u_2(a)\),
some \(u_1, u_2\). We have \(u(n, 0) = u(m, 0)\), all \(n, m\). Let \(\alpha\) be the common
value. Then \(u_1(n) = \alpha - u_2(0)\), all \(n\). Hence, \(u(n, a) = \alpha - u_2(0) + u_2(a)
= u(m, a)\), all \(n, m\). This cannot be, since this society strictly prefers the
smaller of two nonzero populations when the assets \(a > 0\) are the same. Note
that if additivity (or the marginality assumption)
fails, this does not mean that there is no value (utility) function. It only
means that value (utility) cannot be measured by separating out components
and adding.

If the set K of consequences consists of market baskets, the marginality 
assumption might be reasonable. However, this is not reasonable if there is 
an interaction between components. If the first component is eggs and the 
second salt, you might consider 2 eggs and no salt very bad, and also no 
eggs and 1/2 a teaspoon of salt. But you would be happy with 2 eggs and 
1/2 a teaspoon of salt. Thus, you might prefer \(\ell_1\) to \(\ell_2\) in (7.15). Naturally,
if there is such an interaction, we do not expect an additive order-preserv- 
ing utility function u. 


7.3.3 Quasi-additive Utility Functions* 


Since the marginality assumption might be too strong, let us consider 
some weaker conditions. 

Let us recall that one of the necessary conditions on \((A_1 \times A_2, >)\) for
the existence of an additive order-preserving utility function (Section 5.4.3)
was independence: for all \(a, a' \in A_1\) and \(b_0, b_0' \in A_2\),

\[ (a, b_0) > (a', b_0) \iff (a, b_0') > (a', b_0'), \]

and for all \(a_0, a_0' \in A_1\) and \(b, b' \in A_2\),

\[ (a_0, b) > (a_0, b') \iff (a_0', b) > (a_0', b'). \]


Suppose again that there is an additive value function u which satisfies the 
EV Hypothesis for (A, X A2, L, R). Such a function u is an additive 
order-preserving utility function for (A, X Aj, >) if (LZ, R) extends 
(K, >). If u exists, a stronger condition to be called strong independence 
holds. We say that (A, X A,, >, L, R) satisfies strong independence! (on 


*Although we follow Raiffa [1969], much of the material in this section is based on work 
of Keeney [1968a,b]. For more recent references, see Keeney [1972b, 1973c] and Farquhar 
[1974, 1977]. 

†This assumption is sometimes called utility independence.
the first component) if, for all $b_0 \in A_2$, whenever $\ell$ and $\ell'$ are lotteries all of
whose consequences have the form $(a, b_0)$, then preferences for $\ell$ versus $\ell'$
do not change if the common value $b_0$ is changed in every consequence to
the same $b_0'$ in $A_2$ and the probabilities $p(a, b_0)$ and $p(a, b_0')$ are the same.
A similar definition applies to the second component, and we say that
strong independence holds if strong independence holds on both components.
To give an example, we observe that strong independence says that
one prefers lottery $\ell_1$ to lottery $\ell_1'$ if and only if one prefers lottery $\ell_2$ to
lottery $\ell_2'$, where the lotteries $\ell_1$, $\ell_1'$, $\ell_2$, and $\ell_2'$ are given by

[Figure: tree diagrams of the four lotteries. $\ell_1$ (branch probabilities
$p_1, p_2, p_3$) and $\ell_1'$ (branch probabilities $q_1, q_2$) have consequences whose
second component is $b_0$; $\ell_2$ and $\ell_2'$ are the same lotteries with $b_0$
replaced by $b_0'$.]


The verification that strong independence follows from the existence of an
additive value function $u$ on $A_1 \times A_2$ satisfying the EV Hypothesis is left
to the reader. Incidentally, strong independence trivially implies
independence, if we assume that $(L, R)$ extends $(K, >)$.

Under the EV Hypothesis, the condition of strong independence does
not imply that there is an additive value function $u$ on $A_1 \times A_2$ satisfying
the EV Hypothesis or that there is an additive order-preserving utility
function $u$ on $(A_1 \times A_2, >)$. However, under certain simple assumptions,
we shall conclude the existence of a $u$ that is almost additive. Let us say a
real-valued function $u$ on $A_1 \times A_2$ is quasi-additive (bilinear, multilinear) if
there are real-valued functions $u_i$ on $A_i$ ($i = 1, 2$) and a real number $\lambda$ so
that for all $(a, b) \in A_1 \times A_2$,

$$u(a, b) = u_1(a) + u_2(b) + \lambda u_1(a) u_2(b). \tag{7.19}$$

The third term on the right-hand side represents an interaction effect. We
have encountered a variant (a generalization) of this representation, which
we also called quasi-additive, in Section 5.5.2. We shall see that under
certain additional assumptions, strong independence implies the existence
of a quasi-additive value or utility function.
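
As a small illustration of Eq. (7.19), the following Python sketch builds a quasi-additive function from two component functions and a constant $\lambda$. The component values and the choices of $\lambda$ are hypothetical numbers picked only for the example; they do not come from any application cited in the text.

```python
def quasi_additive(u1, u2, lam):
    """Return u with u(a, b) = u1(a) + u2(b) + lam * u1(a) * u2(b), as in Eq. (7.19)."""
    def u(a, b):
        return u1(a) + u2(b) + lam * u1(a) * u2(b)
    return u

# Hypothetical component functions on the levels {0, 1, 2} (illustrative only).
u1 = {0: 0.0, 1: 0.5, 2: 1.0}.get
u2 = {0: 0.0, 1: 0.25, 2: 1.0}.get

additive = quasi_additive(u1, u2, 0.0)      # lam = 0: the interaction term vanishes
interacting = quasi_additive(u1, u2, 2.0)   # lam != 0: the interaction term is present

print(additive(2, 2), interacting(2, 2), interacting(1, 1))
```

With $\lambda = 0$ the function is simply the sum of its components; with $\lambda \neq 0$ the value of a pair exceeds or falls short of that sum by the interaction term.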

Nonadditive representations for utility, in particular the quasi-additive
representation, have had a wide variety of applications, for example, to
transportation (Keeney [1973a], de Neufville and Keeney [1972, 1973]), to
medical decisionmaking (Keeney [1972a]), to management (Huber [1974],
Ting [1971]), to solid waste disposal (Collins [1974]), to air pollution (Ellis
[1970]), and to urban services (Keeney [1973b]). In applying the
quasi-additive representation to evaluating inventory policies of hospital blood
banks, Keeney [1972a] identified the following two dimensions: $A_1$ =
shortage of blood requested by doctors but not readily available in the
blood bank (measured in percent of units demanded); $A_2$ = outdating of
blood caused by exceeding its legal lifetime (measured in percent of total
units stocked in a year). Keeney verified that his decisionmaker's preferences
satisfied strong independence on $A_1 \times A_2$. Her preferences for
lotteries involving fixed $b_0 \in A_2$ did not depend on the level of $b_0$, and
similarly for $A_1$. Keeney derived the following quasi-additive utility
function:


u(a, b) = 0.32(1 — e8*) + 0.57(1 — e4) + 0.1101 — e#)(1 — e). 


Many kinds of nonadditive representations other than quasi-additivity 
might be useful. Recently, Farquhar [1974, 1975, 1976] has developed 
techniques for deriving sufficient conditions for a wide variety of such 
representations. We concentrate on the quasi-additive representation here. 

We say that the component $A_1$ is bounded if there are $a_*, a^* \in A_1$ such
that for all $a \in A_1$,

$$a^* \succsim_1 a \succsim_1 a_*,$$

where

$$a \succsim_1 a' \iff (\exists b \in A_2)[(a, b) > (a', b)].$$

A similar definition applies on the second component. If strong independence
holds, indeed even if only independence holds, and $(L, R)$ extends
$(K, >)$, then

$$a \succsim_1 a' \iff (\forall b \in A_2)[(a, b) > (a', b)],$$

and similarly for $\succsim_2$.
THEOREM 7.3. Suppose $(A_1 \times A_2, >, L, R)$ satisfies the following conditions:

(a) The EV Hypothesis.

(b) Strong independence.

(c) Each component $A_i$ is bounded.

(d) Continuity.*

Then there is a quasi-additive value function $u$ on $A_1 \times A_2$ which satisfies the
EV Hypothesis. If in addition $(L, R)$ extends $(K, >)$, then $u$ is a quasi-additive
order-preserving utility function for $(A_1 \times A_2, >)$.


Proof. Let $u$ be a value function on $A_1 \times A_2$ satisfying the EV Hypothesis.
Let $a_*$ and $a^*$ be bounds for $A_1$, and let $b_*$ and $b^*$ be bounds for $A_2$;
that is, assume

$$a^* \succsim_1 a \succsim_1 a_*, \quad \text{all } a \in A_1,$$

and

$$b^* \succsim_2 b \succsim_2 b_*, \quad \text{all } b \in A_2.$$

We may assume that $u(a_*, b_*) = 0$. For if $u$ is a value function satisfying
the EV Hypothesis, so is $u + \lambda$, any real constant $\lambda$. By continuity, we can
define a function $v_1(a)$ on $A_1$ such that $\ell(a, b)$, the lottery that gives $(a, b)$
with certainty,† is indifferent to the lottery that yields $(a^*, b)$ with
probability $v_1(a)$ and $(a_*, b)$ with probability $1 - v_1(a)$.

This lottery can be thought of as a $v_1(a)$-brit (Section 7.2.3). By strong
independence, $v_1$ is a function of $a$ alone. Similarly, we can define a
function $v_2(b)$ on $A_2$, which is a function of $b$ alone, such that $\ell(a, b)$ is


*See Section 7.2.2 for definition. 
†We are assuming throughout this chapter that all such certain lotteries belong to the set L.
indifferent to the lottery that yields $(a, b^*)$ with probability $v_2(b)$ and
$(a, b_*)$ with probability $1 - v_2(b)$.

Let $u(a^*, b_*) = \alpha_1$, $u(a_*, b^*) = \alpha_2$, and $u(a^*, b^*) = \alpha_3$. Recall that
$u(a_*, b_*) = 0$. Now for all $a, b$, by the EV Hypothesis,

$$
\begin{aligned}
u(a, b) &= v_1(a) u(a^*, b) + [1 - v_1(a)] u(a_*, b) \\
&= v_1(a)\{ v_2(b) u(a^*, b^*) + [1 - v_2(b)] u(a^*, b_*) \} \\
&\quad + [1 - v_1(a)]\{ v_2(b) u(a_*, b^*) + [1 - v_2(b)] u(a_*, b_*) \} \\
&= v_1(a)\{ v_2(b)\alpha_3 + [1 - v_2(b)]\alpha_1 \} + [1 - v_1(a)] v_2(b) \alpha_2 \\
&= \alpha_1 v_1(a) + \alpha_2 v_2(b) + (\alpha_3 - \alpha_1 - \alpha_2) v_1(a) v_2(b).
\end{aligned}
$$

Thus, letting $u_1(a) = \alpha_1 v_1(a)$ and $u_2(b) = \alpha_2 v_2(b)$, one obtains a
quasi-additive representation for $u$. ∎
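
The construction in this proof can be checked numerically. The sketch below is only an illustration under stated assumptions: it takes the hypothetical value function $u(a, b) = a + b + ab$ on levels $0, \ldots, 4$ (so the bounds are 0 and 4 and $u(a_*, b_*) = 0$), computes $v_1$ and $v_2$ from the indifference conditions, and confirms the final identity of the proof. Exact rational arithmetic is used so that the comparisons are exact.

```python
from fractions import Fraction
from itertools import product

levels = range(0, 5)
lo, hi = 0, 4  # assumed bounds: a_* = b_* = 0 and a^* = b^* = 4

def u(a, b):
    # Hypothetical value function with u(a_*, b_*) = 0.
    return Fraction(a + b + a * b)

def v1(a, b):
    # Probability making (a, b) indifferent to the lottery on (a^*, b) and (a_*, b).
    return (u(a, b) - u(lo, b)) / (u(hi, b) - u(lo, b))

def v2(a, b):
    # Probability making (a, b) indifferent to the lottery on (a, b^*) and (a, b_*).
    return (u(a, b) - u(a, lo)) / (u(a, hi) - u(a, lo))

alpha1, alpha2, alpha3 = u(hi, lo), u(lo, hi), u(hi, hi)

def reconstructed(a, b):
    p, q = v1(a, lo), v2(lo, b)
    return alpha1 * p + alpha2 * q + (alpha3 - alpha1 - alpha2) * p * q

pairs = list(product(levels, levels))
v1_depends_only_on_a = all(v1(a, b) == v1(a, lo) for a, b in pairs)
v2_depends_only_on_b = all(v2(a, b) == v2(lo, b) for a, b in pairs)
identity_holds = all(reconstructed(a, b) == u(a, b) for a, b in pairs)
print(v1_depends_only_on_a, v2_depends_only_on_b, identity_holds)
```

For this choice of $u$, $v_1(a) = a/4$ whatever the level of $b$, which is the role strong independence plays in the proof.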


Remark: Under the EV Hypothesis, strong independence is a necessary 
condition for quasi-additivity. The proof is left to the reader. 


COROLLARY (Keeney and Raiffa [1976]). Under the hypotheses of Theorem 7.3,
either there is an additive value function $u$ on $A_1 \times A_2$ which
satisfies the EV Hypothesis, or there is a multiplicative value function $u$ on
$A_1 \times A_2$ which satisfies the EV Hypothesis.


Proof. Suppose $u$ is a value function on $A_1 \times A_2$ which satisfies the EV
Hypothesis and also Eq. (7.19). If $\lambda = 0$, then (7.19) reduces to an additive
representation for $u$. Thus, suppose $\lambda \neq 0$. Suppose first that $\lambda > 0$. Then
clearly

$$v(a, b) = \lambda u(a, b) + 1$$

is again a value function on $A_1 \times A_2$ which satisfies the EV Hypothesis.
Moreover, we have

$$
\begin{aligned}
v(a, b) &= \lambda u_1(a) + \lambda u_2(b) + \lambda^2 u_1(a) u_2(b) + 1 \\
&= [\lambda u_1(a) + 1][\lambda u_2(b) + 1] \\
&= v_1(a) v_2(b),
\end{aligned}
$$

where $v_i(x) = \lambda u_i(x) + 1$. Thus, $v$ is multiplicative. If $\lambda < 0$, let

$$v(a, b) = -\lambda u(a, b) - 1,$$

and use

$$v_1(a) = -\lambda u_1(a) - 1$$

and

$$v_2(b) = \lambda u_2(b) + 1. \quad ∎$$
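
The two transformations in this proof can be verified mechanically. The following Python sketch is an illustration only: the component functions `u1` and `u2` and the grid of levels are hypothetical, and exact fractions are used so that the factorization can be tested with equality.

```python
from fractions import Fraction

def multiplicative_factors(u1, u2, lam):
    """For a quasi-additive u with lam != 0, return (v, v1, v2) as in the proof above."""
    if lam == 0:
        raise ValueError("lam = 0 is the additive case")
    u = lambda a, b: u1(a) + u2(b) + lam * u1(a) * u2(b)
    if lam > 0:
        v = lambda a, b: lam * u(a, b) + 1
        v1 = lambda a: lam * u1(a) + 1
        v2 = lambda b: lam * u2(b) + 1
    else:
        v = lambda a, b: -lam * u(a, b) - 1
        v1 = lambda a: -lam * u1(a) - 1
        v2 = lambda b: lam * u2(b) + 1
    return v, v1, v2

# Hypothetical component functions (illustrative only).
u1 = lambda a: Fraction(a, 3)
u2 = lambda b: Fraction(b * b, 5)

def factorizes(lam):
    v, v1, v2 = multiplicative_factors(u1, u2, lam)
    return all(v(a, b) == v1(a) * v2(b) for a in range(4) for b in range(4))

print(factorizes(Fraction(2)), factorizes(Fraction(-3, 2)))
```

In both cases $v$ is a positive linear transformation of $u$ (the multiplier is $\lambda$ or $-\lambda$, whichever is positive), which is why $v$ still satisfies the EV Hypothesis.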


Let us ask if the axioms of Theorem 7.3 seem reasonable. In both market
baskets and the public health problem, continuity seems reasonable. Let us
accept the EV Hypothesis. There is a lower bound $a_*$ for each component
in the market baskets example, namely 0. We could create an upper bound
$a^*$ for each component by using a reasonable upper limit to consumption,
say 100 times the GNP for safety's sake. In the public health problem,
suppose average assets are used for the utility function. For fixed $a \neq 0$, if
$n, m \neq 0$, $(n, a) > (m, a)$ if and only if $n < m$. Thus 1 could serve as an
upper bound (best alternative) on the population dimension. The lower
bound would be 0. On the assets dimension, 0 would be a lower bound.
The maximum conceivable asset level could serve as an upper bound.
Suppose next that we are in the maximum comfort level society. Then $\succsim_1$
and $\succsim_2$ are strange relations. For every $n$ and $m$, there is $a$ such that
$(n, a) > (m, a)$. Namely, choose $a$ such that $a/n = C$ and $a/m \neq C$.
Thus, $n \succsim_1 m$. It follows that every $n$ qualifies as an upper bound for the
first component. But of course this is not what we had in mind. What is
happening is that, as we shall see, independence is violated, in which case
$\succsim_1$ is not an interesting relation. The same is true for $\succsim_2$.

Let us consider next strong independence. For an extensive discussion of 
empirical tests of this assumption, see Keeney and Raiffa [1976]. See in 
particular the references to applications of utility functions over lotteries 
with multidimensional consequences given at the beginning of Section
7.3.1.

Even independence might not hold for market baskets, as we observed
in Section 5.4.3. For suppose the first component is coffee, and the second
sugar. You might prefer (1,1) to (0,1), but (0,0) to (1,0), having a violent
dislike for coffee without sugar and so having to find a place to dispose of
it.

Independence seems to hold in the public health problem for the society
that uses average assets to measure utility, if we do not allow assets to be 0.
For this society, if $n, m > 0$,

$$(n, a) > (n, b) \iff a/n > b/n \iff a/m > b/m \iff (m, a) > (m, b).$$

Similarly, if $m, n > 0$ and $b > 0$, then $(n, a) > (m, a) \iff (n, b) > (m, b)$.
However, this fails if $b = 0$. Strong independence also seems basically to
be satisfied for this society, at least if we do not allow population or assets
to be 0. For consider lotteries $\ell_1$, which yields $(n_i, x)$ with probability $p_i$,
and $\ell_2$, which yields $(m_i, x)$ with probability $q_i$ ($i = 1, 2, 3$).

If the EV Hypothesis holds, then if all $n_i$, $m_i$, $x$, and $x'$ are positive,

$$
\begin{aligned}
\ell_1 R \ell_2 &\iff \sum p_i \cdot (x/n_i) > \sum q_i \cdot (x/m_i) \\
&\iff x \sum (p_i/n_i) > x \sum (q_i/m_i) \\
&\iff x' \sum (p_i/n_i) > x' \sum (q_i/m_i) \\
&\iff \sum p_i \cdot (x'/n_i) > \sum q_i \cdot (x'/m_i) \\
&\iff \ell_1' R \ell_2',
\end{aligned}
$$

where $\ell_i'$ is obtained from $\ell_i$ by substituting $x'$ for $x$. A similar argument
works on the second component.
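
The chain of equivalences above says that, for the average-assets society, the comparison between two lotteries over population levels does not depend on the common (positive) asset level. The following Python sketch illustrates this with two hypothetical lotteries; the population levels, probabilities, and asset levels are made up for the example.

```python
from fractions import Fraction

def expected_average_assets(lottery, x):
    """Expected value of u(n, x) = x / n over a lottery given as (probability, n) pairs."""
    return sum(p * Fraction(x) / n for p, n in lottery)

def prefers(l1, l2, x):
    """True when l1 R l2 under the EV Hypothesis at the common asset level x."""
    return expected_average_assets(l1, x) > expected_average_assets(l2, x)

# Hypothetical lotteries over population levels; probabilities sum to 1.
l1 = [(Fraction(1, 2), 10), (Fraction(1, 4), 20), (Fraction(1, 4), 40)]
l2 = [(Fraction(1, 3), 8), (Fraction(1, 3), 25), (Fraction(1, 3), 50)]

asset_levels = [1, 7, 100, 12345]
verdicts = {x: prefers(l1, l2, x) for x in asset_levels}
print(verdicts)
```

Because the asset level $x$ factors out of both expected values, the verdict is the same at every positive $x$, which is the content of strong independence on the first component for this society.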
Turning finally to the public health problem in the maximum comfort
level society, we see that, as suggested earlier, even independence is
violated here. Specifically,

$$(n, a) > (n, b) \iff (m, a) > (m, b)$$

fails. For given population level $n$, there is an optimal total wealth $nC$.
Suppose $m \neq n$. Then

$$(n, nC) > (n, mC),$$

but

$$(m, mC) > (m, nC).$$

Strong independence is violated here in an interesting way. Consider the
two lotteries

$\ell_1$: $(n, a)$ with probability $1/8$, $(n, a')$ with probability $7/8$;
$\ell_2$: $(n, b)$ with certainty.

Suppose $a$ is small, $a'$ is large, and $b$ is moderate in size. Suppose $a'/n$ is
close to the ideal comfort level. If $a/n$ is above a level of subsistence, it
might be worth a 7/8 chance of reaching ideal comfort level, rather than
taking a moderate level $b/n$ for certain. Thus, $\ell_1$ might be preferred to $\ell_2$.
However, if $a/n$ is below the subsistence level, $\ell_2$ might be preferred to $\ell_1$.
Thus, by changing $n$, we might change preference between these lotteries.
The choice between $\ell_1$ and $\ell_2$ could arise for a country that is considering a
radical new trade policy or foreign policy.

To close this section, we prove a theorem that tells when a quasi-additive
representation can be made into an additive representation. We use the
notation $a_1 \sim_1 a_2$ to stand for $a_1 \succsim_1 a_2$ and $a_2 \succsim_1 a_1$.

Suppose that $a_1, a_2 \in A_1$, $b_1, b_2 \in A_2$, and not $(a_1 \sim_1 a_2)$ and not
$(b_1 \sim_2 b_2)$. Suppose we are indifferent between the lottery that yields
$(a_1, b_1)$ with probability $\tfrac12$ and $(a_2, b_2)$ with probability $\tfrac12$, and the
lottery that yields $(a_2, b_1)$ with probability $\tfrac12$ and $(a_1, b_2)$ with
probability $\tfrac12$. In this case, we shall say that the marginality assumption
holds for $a_1, a_2, b_1, b_2$.


THEOREM 7.4 (Raiffa [1969]). Given $(A_1 \times A_2, >, L, R)$, suppose there is
a quasi-additive value function $u$ on $K$ satisfying the EV Hypothesis and
suppose $(L, R)$ extends $(K, >)$. Suppose also that for some $a_1, a_2 \in A_1$ and
$b_1, b_2 \in A_2$ such that not $a_1 \sim_1 a_2$ and not $b_1 \sim_2 b_2$, the marginality
assumption holds for $a_1, a_2, b_1, b_2$. Then $u$ is additive.


Proof. By the marginality assumption for $a_1, a_2, b_1, b_2$ and the EV
Hypothesis, we have

$$\tfrac12 u(a_1, b_1) + \tfrac12 u(a_2, b_2) = \tfrac12 u(a_2, b_1) + \tfrac12 u(a_1, b_2).$$

Canceling $\tfrac12$ and using quasi-additivity (7.19), we obtain

$$
\begin{aligned}
&u_1(a_1) + u_2(b_1) + u_1(a_2) + u_2(b_2) + \lambda u_1(a_1) u_2(b_1) + \lambda u_1(a_2) u_2(b_2) \\
&\quad = u_1(a_2) + u_2(b_1) + u_1(a_1) + u_2(b_2) + \lambda u_1(a_2) u_2(b_1) + \lambda u_1(a_1) u_2(b_2).
\end{aligned}
$$

It follows that

$$\lambda u_1(a_2) u_2(b_2) + \lambda u_1(a_1) u_2(b_1) = \lambda u_1(a_2) u_2(b_1) + \lambda u_1(a_1) u_2(b_2).$$

Then we have

$$\lambda [u_1(a_1) - u_1(a_2)][u_2(b_1) - u_2(b_2)] = 0.$$

But not $a_1 \sim_1 a_2$ implies that $u_1(a_1) - u_1(a_2) \neq 0$, and not $b_1 \sim_2 b_2$ implies
that $u_2(b_1) - u_2(b_2) \neq 0$. (The proof, which uses the fact that $(L, R)$
extends $(K, >)$, is left to the reader.) We conclude that $\lambda = 0$, and so
additivity follows. ∎
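
The key algebraic step of this proof is that, for a quasi-additive $u$, the difference in expected value between the two lotteries of the marginality assumption equals $\tfrac12 \lambda [u_1(a_1) - u_1(a_2)][u_2(b_1) - u_2(b_2)]$. The Python sketch below checks this on hypothetical component values (the labels `x`, `y`, `s`, `t` and all numbers are illustrative assumptions).

```python
from fractions import Fraction

def quasi_additive(u1, u2, lam):
    """u(a, b) = u1[a] + u2[b] + lam * u1[a] * u2[b], as in Eq. (7.19)."""
    return lambda a, b: u1[a] + u2[b] + lam * u1[a] * u2[b]

def marginality_gap(u, a1, a2, b1, b2):
    """Expected value of the 50-50 lottery on (a1,b1),(a2,b2) minus that on (a2,b1),(a1,b2)."""
    half = Fraction(1, 2)
    return half * (u(a1, b1) + u(a2, b2)) - half * (u(a2, b1) + u(a1, b2))

# Hypothetical component values with u1(x) != u1(y) and u2(s) != u2(t).
u1 = {"x": Fraction(1), "y": Fraction(3)}
u2 = {"s": Fraction(2), "t": Fraction(5)}

lam = Fraction(1, 4)
gap = marginality_gap(quasi_additive(u1, u2, lam), "x", "y", "s", "t")
predicted = Fraction(1, 2) * lam * (u1["x"] - u1["y"]) * (u2["s"] - u2["t"])
gap_when_additive = marginality_gap(quasi_additive(u1, u2, Fraction(0)), "x", "y", "s", "t")
print(gap, predicted, gap_when_additive)
```

Since the two bracketed differences are nonzero, the gap vanishes (that is, the marginality assumption holds for these four elements) only when $\lambda = 0$.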


Exercises 
1. Let $(a, b)$ represent consumption in two time periods. Consider the
following lotteries:

$\ell$: (\$50,000, \$50,000) with probability $\tfrac12$, (\$80,000, \$80,000) with probability $\tfrac12$;
$\ell'$: (\$50,000, \$80,000) with probability $\tfrac12$, (\$80,000, \$50,000) with probability $\tfrac12$.

If an individual prefers $\ell'$ to $\ell$, conclude that there can be for him no
additive value function over $K$ satisfying the EV Hypothesis.


2. (Farquhar [1974, p. 16]) Consider a two-element system with identical
components operating in parallel, for example, such human organ
systems as eyes, lungs, limbs, or kidneys. Let $A_1$ be an index of relative
performance of one component, say the left kidney, and $A_2$ an index of
performance of the second component, say the right kidney. Is it reasonable
to argue that preferences for removal or nonremoval of either component
(kidney) do not satisfy strong independence?

3. (Farquhar [1974, p. 16]) In the study of painkillers, suppose $A_1$ is the
amount of pain endured during a given time period, and $A_2$ measures the
distribution of pain. Is it reasonable to argue that strong independence
holds for one component, but not for the other?

4. Fishburn and Keeney [1975] study the case where $A_1$ is the number
of days until death and $A_2$ is the quality of a patient's life.

(a) Consider whether marginality holds.
(b) Consider whether strong independence holds.

5. Show that if there is an additive value function $u$ satisfying the EV
Hypothesis, then strong independence holds.

6. Show that if there is a quasi-additive value function satisfying the
EV Hypothesis, then strong independence holds.

7. Show that under the hypotheses of Theorem 7.4, not $a_1 \sim_1 a_2$
implies $u_1(a_1) - u_1(a_2) \neq 0$, and not $b_1 \sim_2 b_2$ implies $u_2(b_1) - u_2(b_2) \neq 0$.

8. Show that the proof of Theorem 7.3 goes through if we replace the
assumption of strong independence by the following weaker assumption
and its analogue for the second component: if $\ell(a, b) I \ell'$, where $\ell'$ is as
follows, then $\ell(a, b') I \ell''$ for all $b'$, where $I$ is the tying relation defined by

$$k I k' \iff \sim kRk' \ \&\ \sim k'Rk$$

and $\ell''$ is as follows:

[Figure: tree diagrams of $\ell'$ and $\ell''$; the surviving labels are the
consequence $(a_*, b)$ for $\ell'$ and the consequence $(a_*, b')$ for $\ell''$.]

($I$ is the relation we have usually denoted $E$; however, in this chapter it
will be useful to reserve the letter $E$ for another purpose.)


9. Suppose $A_1 = A_2 = \{1, 2, \ldots, n\}$ and

$$(a_1, a_2) > (b_1, b_2) \iff u(a_1, a_2) > u(b_1, b_2).$$

Suppose $(L, R)$ extends $(K, >)$, and $u$ is a value function satisfying the EV
Hypothesis. Suppose $u(a_1, a_2) = a_1 a_2$. Show the following:

(a) Strong independence holds.

(b) Continuity holds.

(c) Each component is bounded.

(d) There is a quasi-additive representation.

(e) The quasi-additive representation given by the proof of Theorem
7.3 uses

$$v_1(a) = \frac{a - 1}{n - 1}, \quad v_2(b) = \frac{b - 1}{n - 1},$$

$$u_1(a) = \frac{n}{n - 1}(a - 1), \quad u_2(b) = \frac{n}{n - 1}(b - 1),$$

and

$$\lambda = \frac{n^2 - 2n}{n^2}.$$

(f) The marginality assumption fails.
(g) There is no additive representation.


10. Consider the assertions of parts (a), (b), (c), (d), (f), and (g) of Exer. 
9 for the following functions wu: 
(a) u(a,, 42) = ay + ajay. 
(b) $u(a_1, a_2) = \max\{a_1, a_2\}$.
(c) $u(a_1, a_2) = |a_1 - a_2|$.
(d) $u(a_1, a_2) = 1/(a_1 a_2)$.
(e) $u(a_1, a_2) = a_1/(a_1 + a_2)$.
(f) u(a,, a.) = afay/(a, + ay). 
(g) u(ay, a3) = f (a2) + g(a)h(a,). 
(h) $u(a_1, a_2) = \alpha f(a_1) + \beta g(a_2) + \gamma f(a_1) g(a_2)$.
(i) $u(a_1, a_2) = [\alpha + \beta f(a_1)][\gamma + \delta g(a_2)]$.
(j) $u(a_1, a_2) = \alpha f(a_1) + \beta g(a_2)$.
11. Repeat Exer. 9 if $A_1 = A_2 = \mathrm{Re}$.
12. Repeat Exer. 10 if $A_1 = A_2 = \mathrm{Re}$.


13. (Keeney [1971], Keeney and Raiffa [1976, pp. 226, 243]) Suppose 
strong independence holds on the second component. Identify what hy- 
potheses are needed to show that there is a value function u on A, X A, 
satisfying the EV Hypothesis and such that 


u(a, b) = u,(a) + ui(a)u,(b). 


14. (Keeney [1974], Keeney and Raiffa [1976]) Given the structure
$(A_1 \times A_2 \times \cdots \times A_n, >, L, R)$, one generalization of the quasi-additive
representation is that there is a value function $u$ on $A_1 \times A_2 \times \cdots \times A_n$
that satisfies the EV Hypothesis, and there are real-valued functions $u_i$ on
$A_i$, $i = 1, 2, \ldots, n$, and there is a constant $\lambda$, such that

$$
\begin{aligned}
u(a_1, a_2, \ldots, a_n) &= \sum_{i} u_i(a_i) + \lambda \sum_{i} \sum_{j > i} u_i(a_i) u_j(a_j) \\
&\quad + \lambda^2 \sum_{i} \sum_{j > i} \sum_{l > j} u_i(a_i) u_j(a_j) u_l(a_l) \\
&\quad + \cdots + \lambda^{n-1} u_1(a_1) u_2(a_2) \cdots u_n(a_n).
\end{aligned}
\tag{7.20}
$$

One hypothesis required for the representation (7.20) is the variant of
strong independence which says that if consequences in all components
but the $i$th are held fixed, then preferences among lotteries do not depend
on the level of the fixed components. Formalize this and other hypotheses
that allow the derivation of the representation (7.20). Hint: Repeatedly use

$$u(a_1, a_2, \ldots, a_n) = u_i(a_i) + c_i(a_i) w_i(a_1, a_2, \ldots, a_{i-1}, a_{i+1}, \ldots, a_n).$$


15. (Keeney and Raiffa [1976]) Show that if the hypotheses you introduced
for the representation (7.20) in the previous exercise are satisfied,
then there is a value function on $A_1 \times A_2 \times \cdots \times A_n$ satisfying either an
additive or a multiplicative representation.

16. (Keeney and Raiffa [1976]) A variant of the representation (7.20) is
the following representation for a value function $u$ satisfying the EV
Hypothesis, where the $\lambda_{i \ldots}$ are constants.

$$
\begin{aligned}
u(a_1, a_2, \ldots, a_n) &= \sum_{i} \lambda_i u_i(a_i) + \sum_{i} \sum_{j > i} \lambda_{ij} u_i(a_i) u_j(a_j) \\
&\quad + \sum_{i} \sum_{j > i} \sum_{l > j} \lambda_{ijl} u_i(a_i) u_j(a_j) u_l(a_l) \\
&\quad + \cdots + \lambda_{123 \cdots n} u_1(a_1) u_2(a_2) \cdots u_n(a_n).
\end{aligned}
$$

One hypothesis required for this representation is that if consequences in
all components in any set $X$ of components are held fixed, then preferences
among lotteries do not depend on the level of the fixed components.

Formalize this and other hypotheses which allow the derivation of such a 
representation. Hint: Use 


u(a), a,..., a,) = v(q;,, yy esos a, )+c(a;, Gy. +s a, )w(a;,, Gaia a,), 


where 


ipa ceog dhe and “(Fyudisae< 555} 


form a partition of {1, 2,...,}. Let X be any subset of {1,2,..., 7}. 
Show that
u(a, a,..., 4, =uj(a,u(a;, a3,...,4,)+[1 — ua) ]u(ay, az,..., 47), 
where 
,_ 4 if i<X 
a \a* if ie x, 
7 a, if 1€X 
7 las if TEX, 


and the a* determine a maximum element using the components of X, and 
the a,x determine a minimum element using the components of X. 


7.4 Mixture Spaces 


In this section, we shall seek a representation theorem for the Expected
Utility (Value) Hypothesis. That is, given a preference relation $R$ on a set
$L$ of lotteries, we shall ask for conditions on $(L, R)$ sufficient to guarantee
the existence of a function $E$ on $L$ satisfying, for all $\ell, \ell' \in L$,

$$\ell R \ell' \iff E(\ell) > E(\ell'). \tag{7.21}$$

Of course, we already know conditions on $(L, R)$ sufficient for the
existence of such an $E$: $(L, R)$ must be a strict weak order, and $(L^*, R^*)$
must have a countable order-dense subset. We really want $E$ to have more
properties than (7.21): we want it to act like an expected utility. A key
property of expected utility is the following. Suppose $\ell$ and $\ell'$ are lotteries,
$p$ is a real number in $[0, 1]$, and $\ell''$ is the complex lottery that yields $\ell$
with probability $p$ and $\ell'$ with probability $1 - p$. Then

$$E(\ell'') = pE(\ell) + (1 - p)E(\ell'). \tag{7.22}$$

We shall seek conditions on $(L, R)$ sufficient to guarantee the existence of
a function $E$ on $L$ satisfying Eqs. (7.21) and (7.22). If we can find such a
function $E$, and if $(L, R)$ extends $(K, >)$, then we can define $u: K \to \mathrm{Re}$ by

$$u(c) = E[\ell(c)].$$


It is easy to see that $u$ is an order-preserving utility function over $(K, >)$
and $E$ is expected utility relative to $u$. The problem we study has also been
considered by, among others, von Neumann and Morgenstern [1944], 
Friedman and Savage [1948, 1952], Marschak [1950], Herstein and Milnor 
[1953], Cramer [1956], Luce and Raiffa [1957], Blackwell and Girshick 
[1954], and Huang [1971]. Our approach follows that of Fishburn [1970, 
Chapter 8], Luce and Suppes [1965], and Fishburn and Roberts [1978]. 

The conditions we shall present hold in a more general setting than that
of lotteries. Suppose $A$ is a set of alternatives and $R$ is a binary relation of
(strict) preference on $A$. For every real number $p$ in $[0, 1]$ and for every $a, b$
in $A$, we shall speak of the mixture $apb$. This is a new alternative, an
element of $A$, which is interpreted as a lottery with probability $p$ of
obtaining $a$ and probability $1 - p$ of obtaining $b$. It will be convenient to
think of $A$ as a collection of lotteries with consequences in a set $K$ and $apb$
as the new complex lottery that yields $a$ with probability $p$ and $b$ with
probability $1 - p$.

But these interpretations are merely for motivation purposes. Formally,
one thinks of a function $\theta: A \times [0, 1] \times A \to A$, with $\theta(a, p, b) = apb$. We
shall study the triple $(A, R, \theta)$, and we shall call it a mixture space.

As Gudder [1977] points out, mixture spaces arise in a wide variety of
contexts. One example is in the making of bread. If $F$ is flour and $B$ is
butter, we can speak of the mixture $FpB$: this means $p$ cups of flour for
each $1 - p$ cups of butter. In color vision, we can mix colored lights by
superposition of light beams. If $a$ and $b$ are two colored lights and
$p \in [0, 1]$, then $apb$ is the light obtained by superposing $a$ and $b$ in the
proportion $p : 1 - p$. In quantum mechanics, if $a$ and $b$ are states of a
system and $p \in [0, 1]$, $apb$ represents the state in which the system is in
state $a$ with probability $p$ and in state $b$ with probability $1 - p$. (Axiomatic
developments of quantum mechanics using ideas like mixtures can be
found in Gudder [1973] and Mielnik [1969].)
Let us return to the mixture space $(A, R, \theta)$ in the special case where $A$
is a collection of lotteries and the EU Hypothesis holds. Then there is a
real-valued function $E$ on $A$, the expected utility, such that for all $a, b \in A$
and $p \in [0, 1]$,

$$aRb \iff E(a) > E(b) \tag{7.23}$$

and

$$E(apb) = pE(a) + (1 - p)E(b). \tag{7.24}$$

Speaking abstractly again, we now seek conditions on the triple
$(A, R, \theta)$ sufficient to guarantee the existence of a real-valued function $E$
on $A$ such that Eqs. (7.23) and (7.24) are satisfied. This abstract formulation
of the problem of axiomatizing expected utility goes back to von
Neumann and Morgenstern [1944]. They gave axioms sufficient for the
representation (7.23), (7.24). We shall present a set of axioms based on
some of Fishburn and Roberts [1978], which are all necessary as well as
being jointly sufficient. These axioms are closely related to a set given in
Fishburn [1970].

To present these axioms, we define the "tying" relation $I$ from $R$ by

$$aIb \iff \sim aRb \ \&\ \sim bRa. \tag{7.25}$$

(This is the relation we have usually denoted $E$; however, here we use the
notation $I$ to avoid confusion with the function $E$.) A mixture space
$(A, R, \theta)$ is defined to be an EU Mixture Space if, for all $a, b, c \in A$, the
following axioms are satisfied:

Axiom M1. $(A, R)$ is a strict weak order.
Axiom M2. $(apb) I [b(1 - p)a]$, all $p \in [0, 1]$.
Axiom M3. $[(apb)qb] I [a(pq)b]$, all $p, q \in [0, 1]$.
Axiom M4. If $aRb$, then $(apc)R(bpc)$, all $p \in (0, 1)$.
Axiom M5. If $aRbRc$, then there are $p, q \in (0, 1)$ such that
$(apc)Rb$ and $bR(aqc)$.

Axiom M1 is a standard axiom which clearly follows from Eq. (7.23).

Axioms M2 and M3 follow simply in the case of lotteries. For instance, to
verify Axiom M3 here, the lottery on the left side can be replaced by the
tree diagram

[Tree diagram: $a$ with probability $pq$, $b$ with probability $1 - pq$]

which is the lottery on the right side of Axiom M3. Axioms M2 and M3
also follow easily from Eqs. (7.23) and (7.24), as the reader can readily
verify.

Axiom M4 also directly follows from the representation (7.23), (7.24).
For if $aRb$, then $E(a) > E(b)$, so

$$pE(a) + (1 - p)E(c) > pE(b) + (1 - p)E(c),$$

so

$$E(apc) > E(bpc),$$

so

$$(apc)R(bpc).$$

Axiom M4 is a kind of independence condition, and it is also reminiscent
of the monotonicity condition of extensive measurement (Section 3.2).
Finally, to show that Axiom M5 follows from the representation (7.23),
(7.24), suppose that $aRbRc$. Then

$$E(a) > E(b) > E(c),$$

so

$$1 \cdot E(a) + 0 \cdot E(c) > E(b) > 0 \cdot E(a) + 1 \cdot E(c).$$

There must be $p, q \in (0, 1)$ so that

$$pE(a) + (1 - p)E(c) > E(b) > qE(a) + (1 - q)E(c).$$

THEOREM 7.5 (Fishburn and Roberts [1978])*. If $(A, R, \theta)$ is a mixture
space, then $(A, R, \theta)$ is an EU Mixture Space if and only if there is a
real-valued function $E$ on $A$ such that for all $a, b \in A$ and $p \in [0, 1]$,

$$aRb \iff E(a) > E(b) \tag{7.23}$$

and

$$E(apb) = pE(a) + (1 - p)E(b). \tag{7.24}$$

Moreover, if $E'$ is another real-valued function on $A$ satisfying (7.23) and
(7.24), then $E'$ is related to $E$ by a positive linear transformation; that is,
there are real numbers $\lambda > 0$ and $\lambda'$ such that $E' = \lambda E + \lambda'$.


We omit the proof of the sufficiency of the conditions of an EU Mixture 
Space for the representation (7.23) and (7.24), and of the uniqueness 
statement. But see Exers. 8 through 12. 

The following is an interesting corollary of Theorem 7.5. 


COROLLARY. Suppose L consists of all lotteries with consequences in a set 
K, and R is a binary relation on L. Then there is a real-valued function u on 
K satisfying the EV Hypothesis if and only if the following axioms hold: 

(a) (L, R) is a strict weak order. 


*The theorem in Fishburn and Roberts states explicitly the axiom that for all $a, b \in A$ and
$p \in [0, 1]$, $apb \in A$. This is contained in our definition of a mixture space.




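(b) If $\ell_1 R \ell_2$, then for all $p \in (0, 1)$ and $\ell_3 \in L$, $\ell R \ell'$, where $\ell$ and $\ell'$ are
the following complex lotteries: $\ell$ yields $\ell_1$ with probability $p$ and $\ell_3$ with
probability $1 - p$; $\ell'$ yields $\ell_2$ with probability $p$ and $\ell_3$ with probability $1 - p$.

(c) If $\ell_1 R \ell_2 R \ell_3$, then there are $p, q \in (0, 1)$ such that $\ell R \ell_2$ and $\ell_2 R \ell'$,
where $\ell$ and $\ell'$ are the following complex lotteries: $\ell$ yields $\ell_1$ with
probability $p$ and $\ell_3$ with probability $1 - p$; $\ell'$ yields $\ell_1$ with probability $q$
and $\ell_3$ with probability $1 - q$.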


Proof. Necessity of conditions (a), (b), and (c) is clear. To show
sufficiency, note that the set $L$ forms a mixture space in the obvious way
and that (a), (b), and (c) are Axioms M1, M4, and M5 of an EU Mixture
Space. The other axioms of an EU Mixture Space hold trivially by
conventions for lotteries. Thus, there is a real-valued function $E$ on $L$
satisfying Eqs. (7.23) and (7.24). Using the convention that $c_i$ and $\ell(c_i)$ are
interchangeable, one easily verifies from (7.24) that if $\ell$ is the lottery that
yields consequence $c_i$ with probability $p_i$, then

$$E(\ell) = \sum_i p_i E(\ell(c_i)).$$

The EV Hypothesis follows by taking $u(c_i) = E(\ell(c_i))$. ∎
The axioms for an EU Mixture Space seem intuitively pleasing. In
particular, Axioms M2 and M3 seem very plausible. However, we have
previously mentioned evidence against Axiom M1. As another argument
against Axiom M1, consider the following lotteries:

[Figure: tree diagrams of the lotteries $\ell$, $\ell'$, and $\ell''$; the only surviving
label is the consequence \$100.]

It is possible that $\ell' I \ell$ and $\ell'' I \ell'$, while $\ell'' R \ell$. Thus, indifference is not
transitive, and Axiom M1 is violated.*† The lotteries in the Allais Paradox
(Section 7.2.5) can be used to make an argument against Axiom M4 (see
Exer. 6). It is also easy to make arguments against Axiom M5. The
following example is due to Gudder [1977]. Suppose $a$ = "be given two
candy bars," $b$ = "be given one candy bar," and $c$ = "be hanged at dawn."
Then $aRbRc$, but in the lottery interpretation, there is no $p < 1$ such that
$(apc)Rb$. For more serious examples, see Exers. 4 and 5 below.

The reader might wish to think about whether Axioms M1 through M5 
hold in the various other interpretations of mixture, given above (bread 
and flour, colored lights, quantum mechanics). See Gudder [1977] for a 
discussion of the variants of these axioms obtained by replacing $I$ by $=$.

An interesting consequence of Axioms M1 through M4 has been sub- 
jected to experimental test. If we believe Axioms M1 through M3, we can 
regard this as a test of Axiom M4. The consequence in question is the 
following condition: 


If aRb, then for all p € (0, 1), aR(apb) Rb. (7.26) 


*This argument is due to Fishburn [1970]. 


tA similar argument shows that the preference relation R may not even be a semiorder or 
an interval order (Chapter 6). For if ¢’” is the lottery 


gv—___| $35.50 , 


then we might have (’R(”’ R¢, while ~ (’R?” and ~ ¢”R¢ (Fishburn [1970)). 
Coombs and Huang [1976] have subjected condition (7.26) to experimental
test by providing subjects choices of gambles. In both of their experiments, 
there were, surprisingly, a large number of violations of this condition. 

It is interesting to see how condition (7.26) follows from the axioms. The 
proof should provide a little feeling for how one reasons about mixture 
spaces. We first prove the following lemmas. 


LEMMA 1. In an EU Mixture Space, $(a1b)Ia$.


Proof. Suppose first that $(a1b)Ra$. Then by Axiom M4,

$$[(a1b)\tfrac12 b] R (a \tfrac12 b).$$

But by Axiom M3,

$$[(a1b)\tfrac12 b] I (a \tfrac12 b).$$

This contradicts the definition of $I$. A similar proof shows that $aR(a1b)$ is
impossible. ∎


It should be noted that in the mixture space axioms given by Fishburn
[1970], the result $a1b = a$ is an axiom, and Axioms M2 and M3 are
replaced by axioms in which $I$ is replaced by equality. When equality is
replaced by $I$ to obtain Axioms M2 and M3, the weaker statement $(a1b)Ia$
can be proved from the remaining axioms, as we have seen.


LEMMA 2. In an EU Mixture Space, if $p \in [0, 1]$, then $(apa)Ia$.

Proof.* If $p = 0$ or 1, $(apa)Ia$ follows by Lemma 1 and M2 and an
application of M1. Suppose $p \in (0, 1)$, and suppose $aR(apa)$. Let $q =
(1 + p)^{-1}$. By M4,

$$(aqa) R [(apa)qa].$$

Hence, by M3 and M1,

$$(aqa) R [a(pq)a].$$

Now $pq = 1 - q$, since $q = (1 + p)^{-1}$. Thus we have

$$(aqa) R [a(1 - q)a].$$


*The author thanks Peter Fishburn for this proof. 
By definition of $I$ this violates Axiom M2. A similar proof shows that
$(apa)Ra$ is impossible. ∎


We now show that condition (7.26) holds. Assuming $aRb$ and taking
$c = b$ in Axiom M4 yields

$$(apb) R (bpb).$$

Using $(bpb)Ib$ (which follows by Lemma 2) and Axiom M1, we find

$$(apb) R b. \tag{7.27}$$

Next, using $c = a$ and using $1 - p$ instead of $p$ in Axiom M4, we obtain

$$[a(1 - p)a] R [b(1 - p)a]. \tag{7.28}$$

By Lemma 2 with $1 - p$, by Axiom M2, and by Axiom M1, (7.28) implies
that

$$aR(apb). \tag{7.29}$$

Equations (7.27) and (7.29) give us condition (7.26).
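
Condition (7.26) is easy to confirm when the representation (7.23), (7.24) is available, since $E(apb)$ is then a strict convex combination of $E(a)$ and $E(b)$. The Python sketch below checks this for hypothetical values of $E(a)$ and $E(b)$, and also checks the arithmetic identity $pq = 1 - q$ for $q = (1 + p)^{-1}$ used in the proof of Lemma 2. It is a check of the representation-side consequence, not a derivation from the axioms.

```python
from fractions import Fraction

def E_mix(Ea, p, Eb):
    """E(apb) under Eq. (7.24)."""
    return p * Ea + (1 - p) * Eb

# Hypothetical values with aRb, that is, E(a) > E(b).
Ea, Eb = Fraction(7), Fraction(2)
ps = [Fraction(k, 10) for k in range(1, 10)]  # sample probabilities in (0, 1)

# Condition (7.26): aR(apb)Rb, i.e. E(a) > E(apb) > E(b), for each sampled p.
betweenness = all(Ea > E_mix(Ea, p, Eb) > Eb for p in ps)

# The choice q = 1/(1 + p) in the proof of Lemma 2 satisfies pq = 1 - q.
lemma2_identity = all(p * (1 / (1 + p)) == 1 - 1 / (1 + p) for p in ps)
print(betweenness, lemma2_identity)
```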


Exercises 


1. (Fishburn [1970]) Suppose $X$ is a set. A simple probability measure on $X$
is a function $P$, which assigns a real number $P(A)$ to each subset $A$ of $X$,
and satisfies the following conditions:

(i) $P(A) \geq 0$, for all $A \subseteq X$,
(ii) $P(X) = 1$,
(iii) $P(A \cup B) = P(A) + P(B)$ when $A, B \subseteq X$ and $A \cap B = \emptyset$,
(iv) $P(A) = 1$, for some finite $A \subseteq X$.

If $P$ and $Q$ are simple probability measures on $X$, and $\alpha \in [0, 1]$, then $P\alpha Q$
is defined to be the function $\alpha P + (1 - \alpha)Q$.
(a) Show that $P\alpha Q$ is again a simple probability measure.
(b) Suppose $R$ is a binary relation on the set of all simple probability
measures on $X$, and suppose $P = Q$ implies $PIQ$. Show the following:

(i) The set of simple probability measures on X under the relation 
R satisfies Axiom M2 of an EU Mixture Space. 

(ii) Axiom M1 might fail. (An example analogous to the one in the 
text showing that indifference between lotteries may not be transitive 
suffices to show this.) 

(iii) What about Axiom M3? 

(iv) What about Axiom M4? 

(v) What about Axiom M5? 

(vi) What about the conclusion of Lemma 1? 

(vii) What about the conclusion of Lemma 2? 
(viii) What about condition (7.26)? 
2. (Fishburn [1970, p. 109]) Suppose ℓ and ℓ′ are the following
lotteries:

ℓ: [lottery diagram not recoverable from the source]

ℓ′: with probability p you get executed; with probability 1 − p you get $2.

Use these lotteries to argue against one of the axioms for an EU Mixture
Space.

3. (Gudder [1977]) A recipe for bread calls for 8 cups of flour, 1 cup of 
butter, and 2 cups of water. We can mix flour and butter first and then add 
water, or mix butter and water first and then add flour. The results are 
equivalent. From this, conclude that [(F(8/9)B)(9/11)W]I[F(8/11)(B(1/3)W)].

4. (Gudder [1977]) In a war, suppose a = losing 100 tanks, b = losing
100 men, and c = losing the war. Observe that under the preferences of
most military leaders, Axiom M5 fails in the lottery interpretation.


5. (Gudder [1977]) In a society, action a is considered moderately
good, action b neither good nor bad, and action c taboo. Observe that
Axiom M5 fails in a natural mixture of actions interpretation.

6. (Fishburn [1970, p. 109]) (a) Show that the following condition is 
necessary for the representation (7.23), (7.24): if p ∈ (0, 1) and
(apc) R(bpc), then aRb. 

(b) Suppose lotteries ℓ⁽¹⁾, ℓ⁽²⁾, ℓ⁽³⁾, and ℓ⁽⁴⁾ are as in the Allais
Paradox (Section 7.2.5). Show that if ℓ⁽¹⁾ is preferred to ℓ⁽²⁾, and ℓ⁽³⁾ to ℓ⁽⁴⁾,
then the condition in (a) is violated. To show this, use the lotteries

ℓ⁽⁵⁾: $5,000,000 with probability 10/11; $0 with probability 1/11

and

ℓ⁽⁶⁾: [lottery diagram not recoverable from the source]

and write ℓ⁽¹⁾, ℓ⁽²⁾, ℓ⁽³⁾, and ℓ⁽⁴⁾ as mixtures making use of these lotteries.
(c) Show that in the presence of the other axioms, Axiom M4 implies
the condition in (a). (Thus, the Allais Paradox is an argument against
Axiom M4, as is claimed by Allais.)


7. Huang [1971] has replaced Axioms M4 and M5 by the following
three axioms:


Axiom H1. If aRb, then for all p ∈ (0, 1), aR(bpa) and (bpa)Rb.
Axiom H2. If aIb, then for all c ∈ A and p ∈ [0, 1], (apc)I(bpc).


Axiom H3. If aRbRc and bR(apc), p ∈ (0, 1), then there is q ∈ (0, 1) so
that q > p and


bR(aqc).


Similarly, if aRbRc and (apc)Rb, p ∈ (0, 1), then there is q ∈ (0, 1) so that
q < p and
(aqc)Rb.


She also uses the following axiom from the original list in Fishburn [1970].
This axiom is the result of Lemma 1.


Axiom H4. (a1b)Ia.


(a) Show that Axioms H1 through H4 are necessary for the representation
(7.23), (7.24), and hence follow from the axioms M1 through M5.

(b) Axioms M1 through M3 and H1 through H4 together are
sufficient for the representation (7.23), (7.24). To show this, first show that
if aRb, then


p > q ⟹ (apb)R(aqb).
(c) Next, using the result of (b), show that
aRcRb ⟹ cI(apb) for some p ∈ (0, 1).


(d) Verify Axiom M4 by considering cases such as bRcRa and
bRaRc, etc., and using the result of (c).
(e) Finally, verify Axiom M5 by applying the result of (c).

8. Exercises 8 through 12 sketch a proof of Theorem 7.5. As a first 
step, one uses the axioms (and Lemmas 1 and 2) to prove the following
lemmas. Provide a proof of each. You may assume all of the previously
proved statements in verifying these lemmas.

(a) If aRb and 0 ≤ p < q ≤ 1, then (aqb)R(apb).
(b) Suppose aSbSc and aRc, where


xSy ⟺ ~yRx. (7.30)


Then bI(apc) for exactly one p ∈ [0, 1].
(c) If aRb and cRd and p ∈ [0, 1], then


(apc)R(bpd).
(d) If aIb and p ∈ [0, 1], then


(apb)Ia.


(e) If aIb and p ∈ [0, 1], then for all c,
(apc)I(bpc).
9. Continuing the proof of Theorem 7.5, prove the following:
[(aqb)p(arb)]I{a[pq + (1 − p)r]b}.


10. Continuing the proof of Theorem 7.5, fix c and d in A with cRd and 
consider 


cd = {a ∈ A: cSaSd},


where S is defined in (7.30). By (b) of Exer. 8, there is a unique number
f(a) ∈ [0, 1] for each a ∈ cd, such that


aI{c[f(a)]d},


and such that f(c) = 1 and f(d) = 0.
(a) Prove that for all a, b ∈ cd,


aRb ⟺ f(a) > f(b). (7.31)
(b) Prove that if p ∈ [0, 1] and a, b ∈ cd,


f(apb) = pf(a) + (1 − p)f(b). (7.32)


11. As the last step in the sufficiency proof of Theorem 7.5, do the 
following. 

(a) Given a, b ∈ A, show that there are a_i, b_i ∈ A so that a and b
are both in the set a_ib_i = {e ∈ A: a_iSeSb_i} and so that a_ib_i ⊇ cd, where
c, d are as in Exer. 10.

(b) Show that there is a function f_i* on a_ib_i satisfying (7.31) and
(7.32) on a_ib_i.

(c) Let f_i be obtained from f_i* by a positive linear transformation, in
such a way that f_i(c) = 1 and f_i(d) = 0. Show that f_i also satisfies (7.31)
and (7.32) on a_ib_i.

(d) Show that if a ∈ a_ib_i and a ∈ a_jb_j, then f_i(a) = f_j(a).

(e) Let E(a) be the common value of f_i(a) over all i so that a ∈ a_ib_i.
Show that E satisfies (7.23) and (7.24) for all a, b ∈ A and p ∈ [0, 1].
12. To prove the uniqueness statement in Theorem 7.5, fix c and d in A
and consider the functions


F(a) = [E(a) − E(d)] / [E(c) − E(d)],

F′(a) = [E′(a) − E′(d)] / [E′(c) − E′(d)].


Show that F(a) = F′(a), all a, and derive the uniqueness statement.


13. (Pfanzagl [1968, pp. 217-218]) Suppose (A, R, ∘) is an EU Mixture
Space and the function E on A is non-constant and satisfies (7.23) and


E(apb) = s(p)E(a) + [1 − s(p)]E(b),


for all a, b in A and p ∈ [0, 1]. The function s(p) can be thought of as a
measure of the subjective probability of p (see Chapter 8). Show that the
function s has the following properties.

(a) s(1 − p) = 1 − s(p), for all p ∈ [0, 1].

(b) s(pq) = s(p)s(q), for all p, q ∈ [0, 1].

(c) s(1/2) = 1/2. 

(d) s is monotone increasing on [0, 1]. 

(e) s(p) = p, all p. (This is shown using the methods of Chapter 4.) 


7.5 Subjective Probability 


In Section 7.1, we associated with an act or choice a set of possible
events A_1, A_2, ..., A_n, and with each event A_i we associated a consequence
c_i. The expected utility of the act was defined as

Σ p(A_i)u(c_i),

where p(A_i) is the probability that event A_i occurs and u(c_i) is the utility of
consequence c_i. In passing to the notion of a lottery in Section 7.2, we
suppressed the distinction between events and consequences. We also
assumed that the probabilities p(A_i), or p_i in our lotteries, were known
beforehand. Often this is not a reasonable assumption. For example, in 
buying fire insurance, we may not know exactly the probability of our 
house catching fire over the period for which we want to buy insurance 
(though our broker may have good estimates). In betting on a horse race, 
we do not know exactly the probability that a particular horse will win. In 
making decisions about alternative sources of energy, we do not know the 
probabilities that various significant events will take place. We often have 
some ideas of how probable different outcomes are, or at least that one 
outcome is more probable than another. In the next chapter, we shall 
discuss how to use this information to make subjective or qualitative 
assessment of probabilities and we shall mention various applications of
subjective assessments of probabilities. We shall then return to the EU 
Hypothesis in the situation where the probabilities are subjective. We shall 
study the representation (7.23), (7.24) in the situation where probabilities 
are not known, and mention some of the foundational work of de Finetti 
[1931, 1937], Ramsey [1931], Savage [1954], and von Neumann and 
Morgenstern [1944]. Finally, we shall discuss recent results on the measure- 
ment of subjective probability, and modern applications of such measure- 
ment. 
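
The expected utility sum recalled at the start of this section can be sketched numerically. The following Python fragment is only an illustration: the function name expected_utility and the fire insurance figures are hypothetical and are not taken from the text. It assumes the probabilities p(A_i) are known, which is exactly the assumption questioned above.

```python
def expected_utility(probabilities, utilities):
    """Return the sum of p(A_i) * u(c_i) over the events of one act."""
    if len(probabilities) != len(utilities):
        raise ValueError("need one utility per event")
    return sum(p * u for p, u in zip(probabilities, utilities))


# Hypothetical act "do not insure": the house burns with probability 0.01
# (utility -1000) or does not burn with probability 0.99 (utility 0).
uninsured = expected_utility([0.01, 0.99], [-1000.0, 0.0])

# Hypothetical act "insure": a premium is paid whatever happens (utility -15).
insured = expected_utility([0.01, 0.99], [-15.0, -15.0])
```

With these made-up numbers the uninsured act has the higher expected utility; a different guess at the probability of fire could reverse the comparison, which is why the source of the probabilities matters.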
CHAPTER 8
Subjective Probability 


8.1 Objective Probability 


At the end of the last chapter, we observed that sometimes we have 
some information or intuition about probabilities, but do not know them 
explicitly or have no way to calculate them. In this chapter we shall discuss 
how to “measure” our subjective or qualitative notions of probability. For 
comparison’s sake, we shall briefly review in this section the classical 
definition of probability, which we call objective probability. (The reader 
may skip this section if desired.) In the next section, we contrast the
situation where this notion of probability applies with the situation where 
subjective probability is appropriate. 

Often in science we are concerned with laws of the form: if a certain set 
of conditions C is satisfied, for example if a certain experiment is per- 
formed, then an event A will take place. For example, we assert that if 
water is heated to a certain temperature, it will turn to steam; if supply is 
greater than demand, then prices will decrease; etc. Similarly, we assert 
laws to the effect that if conditions C are realized, an event A will not take
place. If the occurrence of event A is inevitable whenever C is realized, we 
shall speak of A as certain (relative to C). If it inevitably will not occur, we 
speak of A as impossible. 

Often we can only assert that event A is likely (or unlikely) to occur if C 
is realized. Thus, for example, we might say that if two sounds are 
sufficiently different in intensity, then it is likely that people will be able to 
distinguish them. We are speaking of an event that is neither certain nor 
impossible. Sometimes conditions C can be realized very often (as in the
case of heating water), and we can observe that, as a rule, event A occurs a
certain proportion of the time. For example, we cannot predict whether a
given atom of radium will disintegrate during a given interval of time of
length t. But we can say that if we observe enough time periods, and before
each one we specify a particular atom, then the proportion of such periods
in which the specified atom will disintegrate is approximately p = 1 −
e^(-at), where a is the rate of decay. In situations such as these, it makes
sense to speak of the (objective) probability of the event A, thinking of the
probability as the relative frequency or proportion of occurrences of A if
conditions C are realized a large number of times.
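
The relative frequency reading of the decay example can be illustrated by simulation. The sketch below is ours and makes an assumption the text does not state explicitly: that the lifetime of each specified atom is exponentially distributed with rate a. The function names and the numerical values of the rate, the interval length, and the number of trials are hypothetical.

```python
import math
import random


def decay_probability(rate, t):
    """Objective probability 1 - exp(-rate * t) of decay within time t."""
    return 1.0 - math.exp(-rate * t)


def relative_frequency(rate, t, trials, seed=0):
    """Proportion of simulated periods of length t in which the atom decays."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    # Assumption: each atom's lifetime is exponential with the given rate.
    decays = sum(1 for _ in range(trials) if rng.expovariate(rate) <= t)
    return decays / trials


p = decay_probability(rate=0.5, t=1.0)
freq = relative_frequency(rate=0.5, t=1.0, trials=100_000)
```

With many trials, freq should lie close to p, which is the sense in which the probability is a proportion of occurrences.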

This notion of objective probability goes back to Pascal and Fermat, who 
corresponded about gambling in the seventeenth century. It has been 
well-developed and formalized in much the following way. Let X be a set 
of outcomes a, b,c,..., each of which either occurs or fails to occur 
whenever the fixed set of conditions C occurs. We call certain subsets of 
outcomes events and let ℰ denote the set of events. A set consisting of a
single outcome can qualify as an event. So can a more complicated set. We
assume that if A and B are in ℰ, then


A ∪ B ∈ ℰ (8.1)
A ∩ B ∈ ℰ (8.2)
A − B ∈ ℰ (8.3)

A^c ∈ ℰ. (8.4)


A ∪ B is the event that either A or B occurs, A ∩ B is the event that both
A and B occur, A − B is the event that A occurs but B does not, and A^c is
the event that A does not occur.

A collection ℰ will be called an algebra of subsets if it is nonempty and
satisfies Eqs. (8.1) and (8.4). It is easy to see that an algebra of subsets
must also satisfy (8.2) and (8.3). Moreover, since ℰ is not empty, there is
some A in ℰ. It follows that A^c is in ℰ, and hence so is X = A ∪ A^c. X
corresponds to the event that some outcome will occur. We shall assume
that on any given realization of the conditions C, one and only one of the
outcomes in X occurs. The empty set ∅ is an event, since ∅ = A ∩ A^c for
any event A. The empty event ∅ is the event of none of the outcomes in X
occurring. The empty event is impossible: we assume that on each realization
of the conditions C, one of the outcomes in X occurs.
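
For a finite set X, the definition of an algebra of subsets can be checked mechanically. The following Python sketch is an illustration only; the helper names powerset and is_algebra and the small example collections are hypothetical. It tests closure under union and complementation, conditions (8.1) and (8.4), and then confirms on an example that closure under intersection and difference, (8.2) and (8.3), comes along.

```python
from itertools import combinations


def powerset(X):
    """All subsets of the finite set X, as frozensets."""
    items = list(X)
    return {frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)}


def is_algebra(X, events):
    """True if events is nonempty and closed under union and complement."""
    X = frozenset(X)
    events = {frozenset(A) for A in events}
    if not events:
        return False
    closed_complement = all(X - A in events for A in events)            # (8.4)
    closed_union = all(A | B in events for A in events for B in events)  # (8.1)
    return closed_complement and closed_union


X = {1, 2, 3, 4}
coarse = [set(), {1, 2}, {3, 4}, X]   # a small algebra on X
not_algebra = [{1}, {2}, X]           # {1} | {2} and the complements are missing

coarse_sets = {frozenset(A) for A in coarse}
# (8.2) and (8.3) hold automatically for the algebra above.
intersections_ok = all(A & B in coarse_sets for A in coarse_sets for B in coarse_sets)
differences_ok = all(A - B in coarse_sets for A in coarse_sets for B in coarse_sets)
```

The check is exhaustive, so it is only practical for small finite collections; the infinite examples in the exercises below need an argument rather than a computation.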

We measure the probability of an event A ∈ ℰ (the probability that one
of the outcomes in A occurs) as a real number p(A). These probabilities are
assumed to have the following properties:


p(A) ≥ 0 (8.5)


p(X) = 1 (8.6)
A ∩ B = ∅ ⟹ p(A ∪ B) = p(A) + p(B). (8.7)
A function p: ℰ → Re satisfying (8.5), (8.6), and (8.7) is called a probability
measure. The triple (X, ℰ, p) is called a (finitely) additive probability space
if X is a nonempty set, ℰ is an algebra of subsets of X, and p is a
probability measure on ℰ. This axiomatic definition of (objective) proba-
bility was first stated explicitly by Kolmogorov [1933]. The reader is 
referred to any standard treatment of probability (for example, Feller 
[1962]) for a more detailed discussion of the definition and its implications. 
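
On a finite algebra, conditions (8.5) to (8.7) can be verified by direct enumeration. The sketch below is an illustration under our own assumptions: X has six outcomes, ℰ is taken to be all subsets of X, and the names is_probability_measure, uniform, and squared are hypothetical. Exact fractions are used so that the additivity test involves no rounding.

```python
from fractions import Fraction
from itertools import combinations


def powerset(X):
    """All subsets of the finite set X, as frozensets."""
    items = list(X)
    return {frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)}


def is_probability_measure(X, events, p):
    """Check (8.5), (8.6), and (8.7) for p on a finite collection of events."""
    X = frozenset(X)
    nonnegative = all(p(A) >= 0 for A in events)                 # (8.5)
    total_one = p(X) == 1                                        # (8.6)
    additive = all(p(A | B) == p(A) + p(B)                       # (8.7)
                   for A in events for B in events if not (A & B))
    return nonnegative and total_one and additive


X = frozenset(range(1, 7))
events = powerset(X)


def uniform(A):
    """p(A) = |A| / n, the equally likely outcomes measure."""
    return Fraction(len(A), len(X))


def squared(A):
    """Nonnegative with squared(X) = 1, but not additive."""
    return Fraction(len(A) ** 2, len(X) ** 2)
```

Here uniform passes all three tests, while squared satisfies (8.5) and (8.6) but fails (8.7), showing that additivity is a genuine extra requirement.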

The important point to make about objective probability is that, in 
principle, it is universal and replicable. Namely, different individuals in 
different parts of the world computing the objective probability p(A) of an 
event A should, if they agree on certain basics (the probability of certain 
elementary events), agree on p(A). 


Exercises 


1. Show that the following are algebras of subsets under the usual
notions of union and complementation:

(a) X = Re, ℰ = all subsets of X.

(b) X = {1, 2, ..., n}, ℰ = all subsets of X.

(c) X = any infinite set, ℰ = all subsets that are finite or whose
complement is finite.

(d) X = any uncountably infinite set, ℰ = all subsets that are countable
or whose complement is countable.

(e) X = any nonempty set, ℰ = all subsets of X that contain a given
nonempty set A or are disjoint from A.


2. Show that the following define finitely additive probability spaces:
(a) (X, ℰ) as in (b) of Exer. 1, and p(A) = |A|/n.
(b) (X, ℰ) as in (c) of Exer. 1 and

p(A) = 0 if A is finite,
p(A) = 1 if A^c is finite.

(c) (X, ℰ) as in (d) of Exer. 1 and

p(A) = 0 if A is countable,
p(A) = 1 if A^c is countable.


3. Let X be a set of more than two elements and A a nonempty subset
of X. Let ℰ consist of ∅ and all subsets of X that intersect both A and A^c.
Show that (X, ℰ) is not an algebra of subsets.


8.2 Subjective Probability 


In a comprehensive study, Smil [1972, 1974a,b] asked a series of experts 
to estimate the probability of occurrence in the 1970’s of certain “environmental
episodes.” For example, he asked them the probability of the event
A, a severe urban air pollution episode lasting several days with significant
consequences. And he asked them the probability of event B, a widespread 
failure of power supply in a populated, industrial region, lasting several 
hours. (Other events discussed by Smil are shown in Table 8.1.) Now for 
such estimates of probability, the frequency interpretation of Section 8.1 
does not make much sense. It hardly is reasonable to expect to realize the 
conditions C (being in the 1970’s) a large number of times. Rather, here, 
the estimates of probability have a somewhat different interpretation. They 
represent the “degree of certainty” or “degree of conviction” that the 
expert has that an event will occur. They represent what is often called his 
subjective probability of occurrence. Subjective probabilities do not have 
the universal and replicable character that objective probabilities do. 
Different people can have totally different subjective estimates of the 
probability of an event, and an individual can from time to time change 
his own estimate. For a further discussion on objective versus subjective 
probability, see for example Carnap [1950], de Finetti [1937], Fishburn 
[1964], Kemeny [1959], Keynes [1921], Nagel [1939], Savage [1954], or 
Kyburg and Smokler [1964], where many basic papers have been collected. 


Table 8.1. Probabilities of Environmental Episodes in the 1970’s*
(Median and quartiles in percents.)

                                        Lower              Upper
Episode                                 Quartile  Median   Quartile  Mode

Severe urban air pollution episode
lasting several days with
significant consequences                   40       90       100     100

Widespread failure of power
supply in populated, industrial
region, lasting several hours              50       70       100     100

Catastrophe of fully loaded jumbo
tanker (over 100,000 dwt) and
spill of crude oil in the open sea         25       70        95     100

Serious oil spill from offshore
drilling operation causing
ecological disturbance over a
large area                                 20       50        75      20

Radioactive contamination of
environment outside of a reactor
building caused by failure of
nuclear plant protective systems            5        5        10       5


*Adapted from Smil [1974b] with permission of the author.
Our discussion is based on the observation that an individual often
makes subjective estimates of the probabilities of certain events, or at least 
comparisons of the subjective probability of one event to that of another, 
even in situations where repetitions make sense and objective probability 
can be calculated. Subjective probability plays a role in a variety 
of applications—for example, in medical decisionmaking (Betaque and 
Gorry [1971], Fryback [1974]), in weather forecasting (Murphy and 
Winkler [1974], Peterson et al. [1972], Sanders [1963, 1967, 1973], Staël von
Holstein [1971], Winkler and Murphy [1968a]), in assessing stock market
trends (Bartos [1969], Staël von Holstein [1970a, 1972]), in production
planning (of electricity and the like) (Kidd [1970]), and in predicting future 
sociopolitical events (Brown [1973]). Subjective probability judgments 
often have little connection with objective probability, as we shall see. On 
the other hand, since judgments of subjective probability are often made, 
especially in the social and policy sciences, it is incumbent upon us to 
understand the nature of subjective probability judgments, and if possible, 
to put these on a firm measurement-theoretic foundation much as judg- 
ments of preference were in earlier chapters. 

We shall describe three approaches to the measurement of subjective 
probability. The first is simple direct estimation: Ask the expert (or an 
individual) to assign to each event a number that represents his subjective 
estimate of the probability of that event. An example of this procedure is 
the data gathered by Smil [1972] to which we have already referred. A 
variant of this approach involves direct estimation of odds, rather than of 
probabilities. Other variants ask for confidence intervals on probabilities, 
or ask for 50-50 points (points above which or below which the outcome is 
equally likely to lie), quartiles or fractiles, and so on. The second approach 
to the measurement of subjective probability uses preferences among 
complex choices or lotteries, where different prizes are given depending on 
the event which occurs after a certain experiment is performed. If utilities 
of the prizes are known, subjective probabilities of the occurrence of 
different events are calculated using the (Subjective) Expected Utility 
Hypothesis. If utilities are not known, subjective probabilities are esti- 
mated indirectly, by reducing to the third approach. This third approach is 
to ask an individual to make judgments comparing his subjective probabilities
of two events. Thus, we deal with a binary relation ≻ on the set of
events ℰ, with A ≻ B interpreted to mean that “A is judged more probable
than B.” The binary relation ≻ is called the comparative probability
relation or the qualitative probability relation. Usually we assume that the
set of events ℰ forms an algebra of subsets of some set X of outcomes,
that is, (X, ℰ) satisfies conditions (8.1) and (8.4). We try to assign numbers
p(A) to each event A ∈ ℰ so that for all A, B ∈ ℰ,


A ≻ B ⟺ p(A) > p(B). (8.8)




Usually we also ask that p satisfy Eqs. (8.5) to (8.7), that is, that (X, ℰ, p)
be a (finitely) additive probability space. If (X, ℰ, p) satisfies Eqs. (8.8)
and (8.5) to (8.7), then the function p is called a measure of subjective
probability or a measure of qualitative probability. The important question,
from a measurement point of view, is to find (necessary and) sufficient
conditions on the structure (X, ℰ, ≻) for the existence of a subjective
probability measure. In the remainder of this chapter, we discuss these
three approaches to the measurement of subjective probability. For a
recent survey of the subjective probability literature, see Hogarth [1975].
Other surveys are Hampton et al. [1973], Huber [1974], Savage [1971], and
Staël von Holstein [1970a,b].


8.3 Direct Estimation of Subjective Probability 


Various studies have asked for direct estimates of (subjective) probabili-
ties. For example, in the study of Smil [1972, 1974a,b], each expert referred 
to above was asked to state, for each of a number of “environmental 
episodes,” his opinion of the probability of its occurrence during the 
1970’s. (The pooled data is shown in Table 8.1.) In another part of the 
same study, Smil listed a series of major scientific, technological, and 
management inventions, breakthroughs, and changes in the fields of en- 
ergy systems and environmental protection and asked each member of his 
panel to indicate his opinion of the probability of practical implementation 
during certain given periods. (The pooled results of this experiment are 
listed in the last column of Table 8.2.) 

Similarly, Fryback [1974] asked several radiologists to study excretory 
urograms of the urinary tract and to estimate the probability that there was 
a benign cyst, a malignant tumor, or a normal variant. Sanders [1963] 
asked forecasters to estimate the probability of various weather phenomena,
such as wind reaching a certain speed, temperature changing by a
certain amount, or rain falling within a prescribed period. 

Direct subjective estimation of probabilities may be unavoidable in 
many studies, as, for example, in those of Smil or Fryback. However, the 
literature seems to show that when objective probability can be calculated, 
direct subjective probability estimates often differ significantly from objec- 
tive probabilities.* There are well-known examples of the divergence 
between subjective and objective probability in most basic books on 
probability. The classic example is the birthday problem. In a group of 23 
people, the probability that at least two people have the same birthday is 
greater than 1/2, while most people guess it is much lower than 1/2. 
Another common example is the so-called gambler’s fallacy, namely, that 


*Similar observations hold true for direct estimates of confidence intervals, means, 
variances, quartiles, and so on. For a discussion, see, for example, Hogarth [1975]. 
Table 8.2. Probability of Practical Implementation of Various Energy Systems
Breakthroughs*
(The dates a-b-c in the last column in the row corresponding to event A are
interpreted as follows: 25% of the respondents (the optimists) felt that there was at
least a 50% chance that event A would occur before year a; 25% of the respondents
(the pessimists) believed that this was a possibility (of probability ≥ 50%) only
after the year c; and 50% of the respondents believed this was a possibility (of
probability ≥ 50%) first between the years a and c, with a median date of b. Items
are ordered according to median date.)


Number  Item                                                   Quartiles

 1.     Fuel cells for small-scale power generation            1980-1980-1987
 2.     Use of nuclear explosives in the production of
        natural gas and oil, geothermal heat, etc.             1980-1980-1993
 3.     Coal gasification or liquefaction                      1979-1982-1984
 4.     “Fail-safe” nuclear power generation                   1976-1983-1995
 5.     High-temperature gas reactors (A-K cycle)              1979-1984-1990
 6.     Extra-high-voltage transmission on very long
        distances (at least 1000 kV and 1000 km)               1979-1985-1990
 7.     Fast breeder reactors                                  1981-1985-1990
 8.     Cryogenic transmission systems using
        underground superconducting cables                     1983-1985-1995
 9.     Large-scale shale oil recovery                         1983-1986-1996
10.     Fossil fuel-fired magnetohydrodynamics                 1981-1988-1990
11.     Development of all practically feasible
        hydroelectric sites in populated regions               1982-1988-2000
12.     Techniques for economical recovery of
        additional 25% of crude oil from
        known resources                                        1983-1988-1998
13.     Fully automated underground coal mining                1983-1988-2000
14.     Cryogenic pipeline transportation of natural gas       1986-1988-2000
15.     Simple solar furnace for home power generation
        in tropical and subtropical regions                    1986-1990-2000
16.     Low-cost high-voltage underground transmission         1988-1990-2000
17.     Microwave power transmission                           1990-1993-2000
18.     “Fail-safe” systems for drilling and producing
        hydrocarbons at any water depth                        1987-1995-2002
19.     Direct conversion (thermionics)                        1985-1998-2010
20.     Utilization of low thermal difference systems          1990-1999-Never
21.     Controlled thermonuclear power                         1990-2000-2000
22.     Efficient storage of electric energy in large
        quantities                                             1990-2000-2010
23.     Laser power transmission                               1990-2000-2010
24.     Large and efficient tidal power plants                 1992-2000-Never
25.     High-temperature gas reactors with thermal cycle
        other than helium                                      2010-2010-2020
26.     Widespread use of geothermal power                     1990-2020-Later
27.     Relay of solar energy via satellite collectors         2000-2020-Later
28.     Solar energy devices for bulk power generation         2000-Later-Never
29.     Cryogenic superfluid transportation of mechanical
        energy on long distances                               2020-Later-Never
30.     Utilization of gravitational energy (anti-gravity)     Later-Later-Never


*Adapted from Smil [1974b] with permission of the author.
after a sequence of heads, a tail on the next trial becomes more likely.
Kahneman and Tversky [1972] give many other examples, taken from a 
series of experiments. In one experiment, 75 out of 92 subjects said that in 
a survey of families with six children, the exact order of birth of boys and 
girls would be more likely to be GBGBBG than BGBBBB. (These 
sequences are about equally likely, though not exactly since the number of 
boys and girls born is not equal.) The subjects were told that 72 families in 
a given city had the first sequence of children and were asked to estimate 
how many families had the second sequence. The median estimate was 30. 
In another experiment of Kahneman and Tversky, subjects were asked the 
probability of finding more than 600 boys in a sample of 1000 babies, and 
the probability of finding more than 60 boys in a sample of 100. These 
probabilities were judged equal, though the latter is much more likely. In 
general, in comparable probability problems, subjects made estimates 
which were independent of sample size, violating properties of objective 
probability. In other studies, subjects have tended to overestimate the joint 
probability of independent events, and to underestimate the probabilities 
of disjunctive events such as “at least one ....” (See, for example, Bar- 
Hillel [1973].) 
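
Two of the objective probabilities mentioned above, the birthday problem and the comparison of the samples of 100 and 1000 babies, can be computed directly. The Python sketch below is ours and rests on simplifying assumptions: 365 equally likely birthdays, and boys and girls equally likely (which, as noted above, is only approximately true). The function names are hypothetical.

```python
from math import comb


def birthday_collision_probability(n, days=365):
    """Probability that at least two of n people share a birthday."""
    p_all_distinct = 1.0
    for k in range(n):
        p_all_distinct *= (days - k) / days
    return 1.0 - p_all_distinct


def prob_more_than(k, n):
    """P(more than k boys among n babies), assuming each is a boy with
    probability 1/2; computed exactly from binomial coefficients."""
    favorable = sum(comb(n, j) for j in range(k + 1, n + 1))
    return favorable / 2 ** n


p_birthday_23 = birthday_collision_probability(23)
p_small_sample = prob_more_than(60, 100)     # more than 60 boys in 100
p_large_sample = prob_more_than(600, 1000)   # more than 600 boys in 1000
```

Under these assumptions the birthday probability first exceeds 1/2 at 23 people, and the small sample is far more likely than the large one to show more than 60 percent boys, in line with the remarks above on sample size.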

Studies of subjective probability which ask subjects to estimate subjec- 
tive probability indirectly seem to confirm these observations. Preston and 
Baratta [1948] made the first attempt to measure subjective probability 
experimentally, using a simple auction game. They found that small 
objective probabilities (less than 0.20) were systematically overestimated 
and others (greater than 0.20) were systematically underestimated. Other 
studies using indirect estimates exhibit different types of differences be- 
tween subjective and objective probabilities, though the Preston—Baratta 
results have been obtained in a variety of studies. For a summary of such 
studies, the reader is referred to Hogarth [1975, p. 274] and Luce and 
Suppes [1965, Section 4.3]. 

There are several serious problems with direct assessment of subjective 
probability. How one asks for the assessment can lead to significant 
discrepancies. For example, asking for odds can lead to different results 
than asking for numbers between 0 and 1. Offering payoffs can affect 
direct estimates (Phillips and Edwards [1966]). Betaque and Gorry [1971] 
discuss the effects of the likelihood of mistreatment on doctors’ estimates 
of probabilities in diagnosis. In spite of these problems, direct assessment 
of probabilities is becoming an increasingly popular decisionmaking tool in 
government and industry, and so procedures for making these estimates 
have to be better understood in the future. 

Before leaving this section, we should ask what kind of a scale direct 
estimates of subjective probability define. As we mentioned in Chapter 2, 
where there is no obvious representation, admissible transformations must 
be defined as functions preserving the empirical information depicted by a 
scale, and hence scale type is not formally defined, but depends on our
interpretation of what the empirical information content of the scale is. 
There does not seem to be a treatment of scale type in the literature of 
direct estimates of subjective probability. However, a discussion of scale 
type will be important in analyzing studies such as Smil’s. We begin some 
of this analysis here. 

It is tempting to suggest that direct estimates of subjective probability 
define an absolute scale. Both the zero point and the unit are fixed. 
However, if the sum of judged probabilities by an individual is not equal to 
unity, it may only make sense to treat the data as a ratio scale to be 
properly normalized.* In Section 8.5 we shall obtain some representation 
theorems for subjective probability. The representations will, in some 
cases, give rise to absolute scales. However, in other cases, they will not 
even lead to ratio scales. 

Let us apply these observations. Many investigators seek a pooled 
probability assessment from a group of assessors. Following Stone [1961], 
the most frequent method used for pooling probability assessments is to 
take a weighted average 


G(A) = Σ w_i p_i(A),


where G(A) is the group estimate of the probability of A, p_i(A) is the
estimate of the probability of A by the ith assessor, and w_i is a weighting
factor, with w_i ≥ 0 and Σ w_i = 1. If all the weights are equal, this is the
mean estimate. Smil uses M(A), the median of the p_i(A), as his pooled
estimate.
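
The two pooling rules just described can be sketched as follows. The code is an illustration only: the function names weighted_pool and median_pool are hypothetical, and the three estimates are made-up numbers, not data from Smil's study.

```python
from statistics import median


def weighted_pool(estimates, weights):
    """Group estimate G(A): the weighted average of the p_i(A)."""
    if len(estimates) != len(weights):
        raise ValueError("need one weight per assessor")
    if any(w < 0 for w in weights) or abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("weights must be nonnegative and sum to 1")
    return sum(w * p for w, p in zip(weights, estimates))


def median_pool(estimates):
    """Group estimate M(A): the median of the p_i(A)."""
    return median(estimates)


# Hypothetical estimates of p(A) by three assessors.
estimates = [0.2, 0.4, 0.9]
G_equal = weighted_pool(estimates, [1 / 3, 1 / 3, 1 / 3])   # the mean estimate
G_expert = weighted_pool(estimates, [0.6, 0.2, 0.2])        # first assessor favored
M = median_pool(estimates)
```

The mean is pulled toward the one high estimate while the median is not, which is one reason the two rules can rank events differently.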

Smil makes statements to the effect that the group’s estimate of the 
probability of A is greater than its estimate of the probability of B. This is 
the statement 


M(A) > M(B). 


It is not hard to see that this comparison is meaningful if each p_i is the
same ratio scale. Comparison of means is meaningful in this case also; that 
is, the statement 


G(A) > G(B) 


is meaningful. However, if we treat direct estimates of subjective probabil- 
ity as defining a ratio scale, then perhaps it is reasonable to allow different 
transformations of each expert’s scale, in which case comparison of 
medians or of means is not meaningful. 
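
The distinction between a common transformation and separate transformations of each expert's scale can be illustrated numerically. In the Python sketch below, the estimates p_A and p_B are hypothetical, and the names rescale and comparison are ours. A common factor leaves both comparisons unchanged; separate factors for each expert can reverse them.

```python
from fractions import Fraction as F
from statistics import mean, median

# Hypothetical estimates by three experts of the probabilities of A and of B.
p_A = [F(2, 10), F(5, 10), F(6, 10)]
p_B = [F(3, 10), F(4, 10), F(7, 10)]


def rescale(values, factors):
    """Multiply expert i's estimate by that expert's positive factor."""
    return [k * v for k, v in zip(factors, values)]


def comparison(stat, factors):
    """Truth value of stat(A) > stat(B) after rescaling each expert's scale."""
    return stat(rescale(p_A, factors)) > stat(rescale(p_B, factors))


same = [7, 7, 7]   # the same similarity transformation for every expert

median_original = comparison(median, [1, 1, 1])
median_common = comparison(median, same)
median_separate = comparison(median, [3, 1, 1])

mean_original = comparison(mean, [1, 1, 1])
mean_common = comparison(mean, same)
mean_separate = comparison(mean, [1, 5, 1])
```

In this example M(A) > M(B) holds originally and under the common factor but fails under the separate factors, and the comparison of means is likewise reversed, so neither statement has a truth value that is invariant under separate rescalings.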

In pooling the data of Table 8.2, Smil makes the statement, “25% of the
p_i(A) are at least 1/2.” This statement is meaningless, even if each p_i is the
same ratio scale. However, it is meaningful if each p_i is considered an
absolute scale, in which case no admissible transformations (but the
identity) are allowed.


*Amos Tversky (personal communication).


Exercises 


1. (a) Show that the statement M(A) > M(B) is meaningless if the p_i
are possibly different ratio scales.
(b) Do the same for the statement G(A) > G(B). 


2. Consider the meaningfulness of the following statements based on 

Table 8.1. 

(a) The median probability estimate of a severe air pollution episode 
was higher than the median estimate of a widespread power failure. 

(b) The former median was less than twice the latter median. 

(c) The lower quartile estimate of the probability of a serious oil spill 
was less than the median estimate of this probability. 

(d) The median estimate of the probability of a widespread power
failure was equal to the median estimate of the probability of a catastrophe
with a fully loaded tanker.


3. Consider the meaningfulness of the following statements based on 
Table 8.2. 

(a) The pessimistic date 1987 for fuel cells for small-scale power 
generation (item 1) is earlier than the pessimistic date 2000 for low-cost 
high-voltage underground transmission (item 16). 

(b) The median (“b”) date for fail-safe nuclear power generation 
(item 4) is earlier than that for laser power transmission (item 23). 


4. In their assessment of alternatives for Mexico City’s future airport 
expansion, de Neufville and Keeney [1972] and Keeney [1973] obtained 
many judgments by the Ministry of Public Works. A typical statement was 
the following: If no new location is chosen for the airport, then by 1975, 
the probability that the number of people impacted by a noise level of at 
least 90 CNR (a measure of noise impact) will be less than 640,000 is 
one-half. Discuss the meaningfulness of this statement (assuming CNR 
defines an absolute scale). 


5. If an individual makes many direct estimates of probability of (simi- 
lar) events, one would like a way to “score” how well he does against the 
objective probability or real frequency of occurrence. One scoring rule due 
to Brier [1950] and Brier and Allen [1951], which has been used in weather 
forecasting and elsewhere (e.g., Sanders [1963]), goes as follows. Let 


$$B^{(j)} = \frac{1}{N} \sum_{i=1}^{N} \left( f_i^{(j)} - O_i \right)^2,$$


where $f_i^{(j)}$ is the forecast probability by the jth forecaster that event i will
occur (or the event in question will occur on the ith occasion), $O_i$ is 1 if the
ith event occurs (or the event in question occurs on the ith occasion), and 0
otherwise, and N is the number of forecasts made by the jth forecaster.
The Brier Score $B^{(j)}$ is 0 if the forecaster is perfectly correct, and 1 if the
forecaster is as wrong as can be. (For a discussion of other scoring rules,
see Hogarth [1975], Winkler [1969], Winkler and Murphy [1968b], and de
Finetti [1962].*)

The Brier Score has been used to measure performance of weather 
forecasters who deal with such problems as the probability that the wind 
speed will reach a certain critical level within a certain time, that the 
temperature will change by at least a certain amount within a certain time, 
that it will rain within a certain time, etc. 

(a) Is it meaningful to say that one individual forecaster scores better
than another, that is, that $B^{(j)} < B^{(k)}$?

(b) Is it meaningful to say that the (arithmetic) mean score of one
group of forecasters is better than the (arithmetic) mean score of another
group?

(c) Let $\bar{f}_i$ be the arithmetic mean of the group estimates $f_i^{(j)}$ over j,
and let


$$\bar{B} = \frac{1}{N} \sum_{i=1}^{N} \left( \bar{f}_i - O_i \right)^2$$


be the group’s score, if the group mean is taken as the group forecast. 
Sanders [1963] claims that, from data he analyzes, the group score is better 
than the mean of the individual scores; that is, 


$$\bar{B} < \frac{1}{K} \sum_{j=1}^{K} B^{(j)},$$


where K is the number of assessors in the group. In fact, $\bar{B}$ is better than
each $B^{(j)}$, and $\bar{B}$ is more than 5% better than the best individual score. He
takes this as evidence that groups make better forecasts than individuals. 
Consider the meaningfulness of these assertions. 
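
The quantities in Exer. 5 are easy to compute. The following is a minimal sketch with made-up forecasts and outcomes (two hypothetical forecasters, four occasions); the function name `brier_score` is ours and is not taken from the references.

```python
def brier_score(forecasts, outcomes):
    """Mean squared difference between forecast probabilities and 0/1 outcomes."""
    n = len(forecasts)
    return sum((f - o) ** 2 for f, o in zip(forecasts, outcomes)) / n

# Hypothetical data: O_i = 1 if the event occurred on occasion i, else 0.
outcomes = [1, 0, 1, 1]
forecaster_1 = [0.9, 0.2, 0.6, 0.7]
forecaster_2 = [0.5, 0.5, 0.9, 0.3]

scores = [brier_score(f, outcomes) for f in (forecaster_1, forecaster_2)]
mean_of_scores = sum(scores) / len(scores)

# Group forecast: the arithmetic mean of the individual forecasts, occasion by occasion.
group_forecast = [(a + b) / 2 for a, b in zip(forecaster_1, forecaster_2)]
group_score = brier_score(group_forecast, outcomes)

# Squaring is convex, so the group score can never exceed the mean of the
# individual scores, whatever the data.
print(scores, mean_of_scores, group_score)
```

In this illustration the group score is lower (better) than the mean of the individual scores, but not better than the best individual score; whether such comparisons are meaningful is the question the exercise poses.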


8.4 Subjective Probability from Preferences among Lotteries: 
The SEU Hypothesis 


8.4.1 If Utility Is Known†


Sometimes it is possible to estimate subjective probabilities from
preferences among choices, acts, or lotteries. The procedure we shall describe is
used frequently in experimental situations, going back to the work of
Davidson, Suppes, and Siegel [1957]. The basic idea goes back to Ramsey
[1931]. Surveys of the approach can be found in Goodman [1973], Hampton
et al. [1973], Huber [1974], Luce and Suppes [1965], and Staël von
Holstein [1970a]. We shall make use of the version of the Expected Utility
Hypothesis which uses subjective probabilities. This Subjective EU
Hypothesis, or SEU Hypothesis, says that individuals make choices as if they
were subjectively calculating both probabilities and utilities, and choosing
that alternative with the larger (subjective) expected utility. We make use
of the concepts and notation of Section 7.2.1.‡


*Much of the literature of scoring rules has been concerned with the design of rules that
encourage the assessor to be careful in his assessments and to report only his true beliefs,
rather than to give answers that will improve his score.

†This subsection and the next are based on Luce and Suppes [1965, Section 3.3].

Suppose we wish to estimate the probability p(A) of an event A. We
shall first see how to do it if we know how to measure the utility of
consequences. Suppose we compare the two choices ℓ and ℓ’ described as
follows:


ℓ: either event A occurs or it does not. If A occurs, the outcome is c,
and if A does not occur, the outcome is c’.

ℓ’: either event A occurs or it does not. If A occurs, the outcome is d,
and if it does not occur, the outcome is d’.


If you are indifferent between ℓ and ℓ’, then it follows by the SEU
Hypothesis that E(ℓ) = E(ℓ’), that is, that


$$p(A)u(c) + p(A^c)u(c') = p(A)u(d) + p(A^c)u(d'),$$


where p gives the subjective probability we are trying to discover, and u is
the known utility. Assuming that $u(c) - u(c') + u(d') - u(d) \neq 0$, and
using $p(A^c) = 1 - p(A)$, we conclude that


$$p(A) = \frac{u(d') - u(c')}{u(c) - u(c') + u(d') - u(d)}.$$    (8.9)

Thus, if consequences c, c’, d, and d’ can be found so that choices ℓ and ℓ’
are judged indifferent and $u(c) - u(c') + u(d') - u(d) \neq 0$, then subjective
probabilities can be calculated from known utilities, using Eq. (8.9).
Another method for finding subjective probabilities from known utilities is
described in Exer. 1 below.
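
Equation (8.9) can be checked numerically. The sketch below uses hypothetical utility values for the four consequences and exact rational arithmetic; the function name and arguments are ours. Note that (8.9) by itself does not force the result to lie between 0 and 1; that depends on the indifference judgment being consistent with the SEU Hypothesis.

```python
from fractions import Fraction

def subjective_probability(u_c, u_c_prime, u_d, u_d_prime):
    """Solve p*u(c) + (1-p)*u(c') = p*u(d) + (1-p)*u(d') for p, as in Eq. (8.9)."""
    denominator = Fraction(u_c) - u_c_prime + u_d_prime - u_d
    if denominator == 0:
        raise ValueError("u(c) - u(c') + u(d') - u(d) must be nonzero")
    return (Fraction(u_d_prime) - u_c_prime) / denominator

# Hypothetical utilities: u(c) = 10, u(c') = 0, u(d) = 2, u(d') = 6.
p = subjective_probability(10, 0, 2, 6)

# Both choices should then have the same (subjective) expected utility.
expected_first = p * 10 + (1 - p) * 0
expected_second = p * 2 + (1 - p) * 6
print(p, expected_first, expected_second)
```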


8.4.2 If Utility Is Not Known 


There is an alternative, not quite so direct approach to determine
subjective probabilities from preferences among choices, which does not
require explicit knowledge of a utility function over consequences. This
procedure is to derive the comparative probability relation > of Section
8.2 using the SEU Hypothesis and preferences among acts or choices, and
then to use the methods of Section 8.5 to calculate subjective probability
from >. Suppose A and A’ are two disjoint events and B is the event
$(A \cup A')^c$. Let c, c’, and d be consequences such that c is preferred to c’.
Note that we do not need to know how to calculate utilities over
consequences to find such c and c’. However, we do know that if u is a utility
function over K, the set of consequences, then u(c) - u(c’) > 0. Let
choices ℓ and ℓ’ be defined as follows:


ℓ: either event A, event A’, or event B occurs, and the consequences
are, respectively, c, c’, and d.

ℓ’: either event A, event A’, or event B occurs, and the consequences
are, respectively, c’, c, and d.


‡We shall not distinguish between the Expected Utility Hypothesis and the Expected Value
Hypothesis, or between utility functions and value functions.


Then if R is strict preference among choices, we have by the SEU
Hypothesis:


$$\begin{aligned}
\ell R \ell' &\iff p(A)u(c) + p(A')u(c') + p(B)u(d) > p(A)u(c') + p(A')u(c) + p(B)u(d) \\
&\iff p(A)u(c) + p(A')u(c') > p(A)u(c') + p(A')u(c) \\
&\iff p(A)[u(c) - u(c')] > p(A')[u(c) - u(c')] \\
&\iff p(A) > p(A') \quad (\text{since } u(c) - u(c') > 0) \\
&\iff A > A'.
\end{aligned}$$


Thus, we can derive > among disjoint events from preferences on choices 
(lotteries). From > among disjoint events we can easily derive it among 
nondisjoint events, if we are willing to assume the following monotonicity 
axiom: if $A \cap B = A \cap C = \emptyset$, then


$$B > C \iff A \cup B > A \cup C.$$


We shall encounter this axiom again in the next section, where we shall 
discuss it. To see the role it plays, consider events D and E. We can write 


$$D = (D \cap E) \cup (D \cap E^c)$$
and

$$E = (D \cap E) \cup (D^c \cap E).$$
By the monotonicity axiom,


$$D > E \iff D \cap E^c > D^c \cap E,$$
and the events $D \cap E^c$ and $D^c \cap E$ are disjoint. Thus, we can determine
> between all pairs of events. Note that in our determination of >, we do
not really need to know the utilities of consequences, but only that there
are consequences c and c’ such that u(c) > u(c’), that is, such that c is
preferred to c’.
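
The reduction to disjoint events can be sketched as follows. Here the comparison of disjoint events, which in practice would come from observed preferences among lotteries, is simulated by a hypothetical additive probability; the names `disjoint_oracle` and `more_probable` are ours.

```python
from fractions import Fraction
from itertools import combinations

# Hypothetical probabilities, used only to simulate the judgments on disjoint events.
weights = {"x": Fraction(1, 2), "y": Fraction(1, 3), "z": Fraction(1, 6)}

def prob(event):
    return sum((weights[s] for s in event), Fraction(0))

def disjoint_oracle(A, B):
    """Stand-in for the relation > derived from lotteries; defined for disjoint events only."""
    assert not (A & B), "the oracle only compares disjoint events"
    return prob(A) > prob(B)

def more_probable(D, E, oracle):
    """Decide D > E via the monotonicity axiom: compare D - E with E - D."""
    return oracle(D - E, E - D)

events = [frozenset(c) for r in range(4) for c in combinations("xyz", r)]

# The derived relation agrees with the direct numerical comparison on all pairs.
agrees = all(more_probable(D, E, disjoint_oracle) == (prob(D) > prob(E))
             for D in events for E in events)
print(agrees)
```

The set differences D - E and E - D are exactly the events $D \cap E^c$ and $D^c \cap E$ of the text, with the common part $D \cap E$ playing the role of A in the monotonicity axiom.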


8.4.3 The Savage Axioms for Expected Utility 


In Section 7.5, we observed that often one must make choices between 
complex acts or lotteries when the probabilities are not known. We have 
already seen that we can derive subjective probabilities from preferences 
among such acts, provided we invoke the SEU Hypothesis and we know 
how to measure utility. We have also seen that we can derive the qualita- 
tive probability ordering, even if utilities are not known. One very interest- 
ing approach to utility theory states conditions on preferences among acts, 
which are sufficient to allow the derivation of both a subjective probability 
measure and a utility function, which together satisfy the SEU Hypothesis. 
This approach, which builds on earlier work of de Finetti [1931, 1937], 
Ramsey [1931], and von Neumann and Morgenstern [1944], is due to 
Savage [1954]. We briefly discuss it. Good treatments of Savage’s approach 
can be found in Luce and Raiffa [1957] and Fishburn [1970]. 

Suppose (X, &) is the algebra of events where & consists of all subsets 
of X, and K is a set of consequences. Generalizing our discussion of 
Chapter 7, an act or choice will be thought of as a function f which assigns 
to each element of X a consequence of K. A finite act or choice f is an act
with the property that there is a partition $A_1, A_2, \ldots, A_n$ of X, with each
$A_i$ in &, and there are elements $c_1, c_2, \ldots, c_n$ of K, so that on $A_i$, f always
assigns the consequence $c_i$. These are the types of acts or choices or
lotteries we studied in Chapter 7. We shall denote such a finite act as


$$f(A_1, A_2, \ldots, A_n; c_1, c_2, \ldots, c_n).$$
Let @ be the set of acts and let R be a binary relation on @ interpreted as
strict preference. Savage states axioms on the system (X, &, K, @, R)
sufficient to guarantee the existence of real-valued functions p on & and u
on K satisfying the following conditions:

(X, &, p) is a finitely additive probability space    (8.10)


and


$$f R f' \iff \sum_{i=1}^{n} p(A_i)u(c_i) > \sum_{j=1}^{m} p(B_j)u(d_j),$$    (8.11)

for all finite acts

$$f = f(A_1, A_2, \ldots, A_n; c_1, c_2, \ldots, c_n),$$
$$f' = f(B_1, B_2, \ldots, B_m; d_1, d_2, \ldots, d_m)$$

in @.* If utility is bounded, we may define expected utility by
$$EU(f) = \sum_{x \in X} p(x)\, u[f(x)].$$


As Fishburn [1970, p. 194] points out, Savage’s axioms actually imply that 
utility is bounded, and for all f, f’ € @, 


$$f R f' \iff EU(f) > EU(f').$$


Suppose > is a binary preference relation on K. If we identify a con- 
sequence c with the act f(X; c), then it is reasonable to assume that 


$$c > c' \iff f(X; c) \, R \, f(X; c').$$


To use a notion from Chapter 7, we say that (@, R) extends (K, >). If 
(@, R) extends (K, >), it follows from (8.10) and (8.11) that u is an 
order-preserving utility function† for (K, >). Presentation of Savage’s
axioms would take us too far afield, and so we simply refer the reader to 
one of the references above for such a presentation. See also Exers. 8 and 
9. 

A different approach than Savage’s, with much the same goals, can be
found in Luce and Krantz [1971] (see Krantz et al. [1971, Chapter 8]).

Sometimes a representation (8.10), (8.11) is sought with utility u additive
with respect to an operation $\circ$. Axioms for this representation and a
related one are studied in Roberts [1974] and in Luce [1972].


Exercises 


1. If utilities are known, and if A is an event whose probability is
unknown, you can estimate the subjective probability p(A) as follows:
Find consequences c and c’ and a lottery ℓ with known probabilities,
which is judged indifferent to the lottery ℓ’ which gives c if A occurs and c’
if A does not occur. Show that p(A) can be calculated from the EU
Hypothesis and E(ℓ).


*The representation (8.11) alone can be looked at as a special case of the polynomial 
conjoint measurement representation studied in Section 5.5.1. 
†See Section 7.2.2 for the definition of order-preserving utility function.


2. Suppose u($n) = n. Suppose you wish to estimate your subjective
probability that it will be fair weather tomorrow. Show how you can do
this if you are indifferent between the following lottery ℓ and the lottery
that gives you $1 if it is fair tomorrow and takes $1 from you if it is not:


3. Suppose there are three candidates running for election, a Repub- 
lican, a Democrat, and an Independent. Using dollar payoffs $100, $200, 
and $300, design lotteries to determine whether an individual thinks 
election of the Republican is more likely than election of the Democrat. 


4. In the situation of Exer. 3, suppose u($n) = n. Discuss how to use 
the methods of Section 8.4.1 to determine the subjective probability of a 
Republican’s being elected. (& can be taken to be all subsets of the set
{Republican, Democrat, Independent}.)


5. Suppose $u(a) \neq u(b)$ and you are indifferent between the two acts
$f[A^*, (A^*)^c; a, b]$ and $f[A^*, (A^*)^c; b, a]$, to use the notation of Section
8.4.3. Show from the SEU Hypothesis that you think $A^*$ and $(A^*)^c$ are
equally likely.

6. Show that under the SEU Hypothesis,


$$f[A^*, (A^*)^c; a, d] \, R \, f[A^*, (A^*)^c; b, c] \iff u(a) - u(b) > u(c) - u(d).$$


(If u is unknown, we can define a quaternary relation D on the set of
consequences K by


$$ab \, D \, cd \iff f[A^*, (A^*)^c; a, d] \, R \, f[A^*, (A^*)^c; b, c].$$


Then we can use the methods of Section 3.3 to find u. This idea goes back 
to Ramsey [1931].) 


7. (Tversky [1967]) Let X be a set of events and C a set of monetary
amounts. For every x in X and c in C, consider the act or choice or gamble
that gives payoff c if x occurs and 0 if x does not occur. Let L be the set of
all such gambles and R a binary relation of preference on L. Then L can
be thought of as a Cartesian product X × C. Show that there are functions
p: X → Re and u: C → Re satisfying the SEU Hypothesis if and only if the
pair (L, R) = (X × C, R) satisfies additive conjoint measurement
(Section 5.4).

8. One of Savage’s axioms is the following: Suppose (@, R) extends
(K, >). Fix c and c’ so that c > c’. Define >* on & by


$$A >^* A' \iff f[A, A^c; c, c'] \, R \, f[A', (A')^c; c, c'].$$


Let A ≽* A’ hold if and only if ~[A’ >* A]. Then for all A, A’ in &,
either A ≽* A’ or A’ ≽* A. Show that if (@, R) extends (K, >), then this
axiom follows from the representation (8.10), (8.11). (The relation >* is
the comparative probability relation.)

9. Another of Savage’s axioms is the following: Suppose f and f’ agree
on A, g and g’ agree on A, f and g agree on $A^c$, and f’ and g’ agree on $A^c$.
Then


$$f R g \iff f' R g'.$$


This says that preference between two acts should not depend on those 
outcomes that have identical consequences for the two acts. Show that this 
axiom follows from the representation (8.10), (8.11). 


10. Consider the following acts: 


f = f(C, D, E; a, b, b),
g = f(C’, D, E’; a, b, b),
f’ = f(C, D, E; a, a, b),
g’ = f(C’, D, E’; a, a, b).


Recall that C, D, E partition X and C’, D, E’ partition X. Show that
according to the axiom in Exer. 9, with $A = (C \cap E') \cup (C' \cap E)$,


$$f R g \iff f' R g'.$$


(However, as Krantz et al. [1971, p. 210] point out, if D adds more to C’ 
than it does to C, we could have fRg and g’Rf’. They illustrate this point 
with the following game, adapted from Ellsberg [1961]. Consider three 
urns, one containing 200 white balls, one containing 200 black balls, and 
one containing 100 red balls. A coin is flipped, and, depending on its 
outcome, one of the first two urns is selected. Without informing the 
player, the balls from the selected urn are mixed with the balls from the 
third urn, the red balls. The player draws one ball from the mixture. 
Suppose a denotes a valuable prize, and b no prize. Consider the gambles 
with consequences based on the color of the ball drawn: 


f = f(Red, Black, White; a, b, b) 
and 


g = f(White, Black, Red; a, b, b).

In each case, the (objective) probability of winning prize a is 1/3. However,
the player may prefer f to g. At least in f, he knows that after the
toss of a coin, he has a chance of winning. In g, his action after the coin 
toss might have no effect, if the black urn had been chosen in the coin toss. 
Compare the gambles 


f’ = f(Red, Black, White; a, a, b) 
and 
g’ = f(White, Black, Red; a, a, b). 


Now, the (objective) probability of winning prize a is 2/3 in each case. 
However, in f’, the player’s chances of winning with his own action, after 
the coin toss, can go as low as 1/3 (if the white urn was chosen), while in 
g’, his chances of winning by his own action are always 2/3, regardless of 
the urn chosen as a result of the coin toss. Thus, he may prefer g’ to f’. 
Both Ellsberg [1961] and Raiffa [1961] reported that sophisticated subjects 
(in informal questioning) had exactly the preferences fRg and g’Rf’.) 

11. The Ellsberg example of Exer. 10 can be thought of as an argument 
against the additivity of subjective probability (and against the Monotonic- 
ity Axiom [Eq. (8.16)] we shall introduce in Section 8.5.1). For, let 
$A \cap B = A \cap C = \emptyset$, and consider


$$f = f(B, A, (A \cup B)^c; a, b, b)$$
$$g = f(C, A, (A \cup C)^c; a, b, b)$$
$$f' = f(B, A, (A \cup B)^c; a, a, b)$$
$$g' = f(C, A, (A \cup C)^c; a, a, b).$$

Then show that fRg can be interpreted as “B is subjectively more probable
than C” and g’Rf’ can be interpreted as “A ∪ C is subjectively more
probable than A ∪ B.”
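
The objective probabilities quoted in the urn game of Exer. 10 can be verified directly. The sketch below encodes the two possible mixtures (the selected urn combined with the 100 red balls) and computes, for each gamble, the overall chance of winning prize a and the chance conditional on the coin toss. The dictionary layout and function names are ours.

```python
from fractions import Fraction

# Composition of the mixture the player draws from, for each outcome of the coin toss.
mixtures = {
    "white urn": {"White": 200, "Black": 0, "Red": 100},
    "black urn": {"White": 0, "Black": 200, "Red": 100},
}

def conditional_win(winning_colors, urn):
    """Chance of drawing a winning color, given which urn was mixed with the red balls."""
    mixture = mixtures[urn]
    return Fraction(sum(mixture[c] for c in winning_colors), sum(mixture.values()))

def overall_win(winning_colors):
    """Chance of winning before the (fair) coin is tossed."""
    return sum(Fraction(1, 2) * conditional_win(winning_colors, urn) for urn in mixtures)

# Colors on which each gamble pays the prize a.
gambles = {
    "f": {"Red"},
    "g": {"White"},
    "f'": {"Red", "Black"},
    "g'": {"White", "Black"},
}

for name, colors in gambles.items():
    by_urn = {urn: conditional_win(colors, urn) for urn in mixtures}
    print(name, overall_win(colors), by_urn)
```

The overall chances are equal within each pair of gambles, while the conditional chances differ in the way the exercise describes: g can drop to 0 after the coin toss, and f’ can drop to 1/3, whereas g’ stays at 2/3.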


8.5 The Existence of a Measure of Subjective Probability 
8.5.1 The de Finetti Axioms 


Let us return to the algebra of events (X, &) and the binary relation of 
comparative probability > on &. The comparative probability relation > 
was apparently first studied by Bernstein [1917]. Later references include 
Keynes [1921], de Finetti [1931, 1937], Koopman [1940a,b], Carnap [1950], 
Savage [1954], Suppes [1956], Kraft et al. [1959], Scott [1964], Luce
[1967, 1968], Domotor [1969], Fishburn [1969, 1970, 1975], Fine
[1971a, b, 1973, 1977], Kaplan [1971, 1974], Roberts [1973], Narens [1974],
Fine and Gill [1976], Suppes and Zanotti [1976] and Fine and Kaplan
[1977].

We shall seek conditions on (X,&, >) sufficient to guarantee the 
existence of a subjective probability measure p, that is, a real-valued 
function p on & so that (X, &, p) is a finitely additive probability space 
and so that for all A, B in &, 


$$A > B \iff p(A) > p(B).$$    (8.8)


(Some authors have studied a weakening of condition (8.8) to the repre- 
sentation 


$$A > B \implies p(A) > p(B).$$


For references, see for example Savage [1954], Fine [1973], Fishburn 
[1969, 1975] and the recent work of Cohen [1978].) Calculation of a 
subjective probability measure based on judgments of comparative proba- 
bility is again subject to the difficulties we have discussed with the other 
methods: how questions are asked, availability of payoffs, and extraneous 
information can affect answers. 

We shall use the notation 


$$A \succeq B \iff \sim(B > A)$$    (8.12)
and


$$A \sim B \iff \sim(A > B) \;\&\; \sim(B > A).$$    (8.13)


A ≽ B means that A is judged at least as probable as B, and A ~ B means
A and B are judged equally probable.

The following conditions on (X, &, >) are necessary for the existence 
of a subjective probability measure: for all A, B, and C in & , 


(&, >) is a strict weak order,    (8.14)

$X > \emptyset$ and $A \succeq \emptyset$,    (8.15)

If $A \cap B = A \cap C = \emptyset$, then $B > C \iff A \cup B > A \cup C$.    (8.16)

Condition (8.16) is sometimes called a Monotonicity Axiom, and it is
similar to the Monotonicity Axiom we encountered in studying extensive
measurement in Section 3.2. That Condition (8.14) is necessary follows
directly from Eq. (8.8). To see that Condition (8.15) is necessary, note that


$$1 = p(X) = p(X \cup \emptyset) = p(X) + p(\emptyset) = 1 + p(\emptyset),$$

so


$$p(\emptyset) = 0 < p(X),$$

so X > ∅. And, using Eqs. (8.5) and (8.8),


$$p(A) \geq 0 \implies p(A) \geq p(\emptyset) \implies A \succeq \emptyset.$$

Finally, to see that Condition (8.16) is necessary, note that if


$$A \cap B = A \cap C = \emptyset,$$


then

$$p(B) > p(C) \iff p(A) + p(B) > p(A) + p(C) \iff p(A \cup B) > p(A \cup C).$$


Conditions (8.14) through (8.16) were apparently first listed by de Finetti 
[1937], and they are called the de Finetti axioms. 
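
For a finite X the necessity argument can be confirmed by brute force. The sketch below checks Conditions (8.14) through (8.16) for the relation induced through Eq. (8.8) by a hypothetical three-point probability; a strict weak order is tested as an asymmetric, negatively transitive relation. All names are ours.

```python
from fractions import Fraction
from itertools import combinations

def events_of(X):
    """All subsets of the finite set X, as frozensets."""
    xs = sorted(X)
    return [frozenset(c) for r in range(len(xs) + 1) for c in combinations(xs, r)]

def satisfies_de_finetti(X, more):
    """Check (8.14)-(8.16) for a strict relation more(A, B) on the subsets of X."""
    E = events_of(X)
    empty, full = frozenset(), frozenset(X)
    asymmetric = all(not (more(A, B) and more(B, A)) for A in E for B in E)
    negatively_transitive = all(more(A, B) or more(B, C) or not more(A, C)
                                for A in E for B in E for C in E)
    bounds = more(full, empty) and all(not more(empty, A) for A in E)
    monotone = all(more(B, C) == more(A | B, A | C)
                   for A in E for B in E for C in E
                   if not (A & B) and not (A & C))
    return asymmetric and negatively_transitive and bounds and monotone

# Hypothetical probability on X = {x, y, z}.
weights = {"x": Fraction(1, 2), "y": Fraction(1, 3), "z": Fraction(1, 6)}

def p(A):
    return sum((weights[s] for s in A), Fraction(0))

def induced(A, B):
    """The relation A > B defined from p by Eq. (8.8)."""
    return p(A) > p(B)

def by_size_reversed(A, B):
    """A relation that is not a comparative probability: smaller sets rank higher."""
    return len(A) < len(B)

print(satisfies_de_finetti(weights, induced),
      satisfies_de_finetti(weights, by_size_reversed))
```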

Before going further, let us discuss whether the de Finetti axioms are 
reasonable ones for judgments of comparative probability. The first two 
axioms seem to be accepted in most studies. However, we can question 
Condition (8.14) in much the same way that we have questioned whether 
preference is a strict weak order. In particular, much as we did in Chapter 
6, we can question whether or not the relation A ~ B, “seems equally 
probable,” is transitive. This would follow if (&, >) were strict weak. The
third axiom, (8.16), has been questioned by writers such as Edwards [1962],
who argues that


$$p(A \cup B) = p(A) + p(B)$$


for $A \cap B = \emptyset$ is not necessarily satisfied by subjective probabilities. This
is really an argument against the representation we are studying. See Exer. 
11 of Section 8.4 for another argument against (8.16). 


8.5.2 Insufficiency of the de Finetti Axioms 


De Finetti asked whether his axioms were sufficient to guarantee the 
existence of a subjective probability measure. One example showing they 
are not was given by Savage [1954, p. 41]. Savage’s example used an 
infinite collection of events &. Later, Kraft, Pratt, and Seidenberg [1959] 
gave a counterexample with a finite set &. We present their counterexam- 
ple. 

Let X = {a, b,c, d, e}, and let & be the collection of all subsets of X. 
Choose ε such that 0 < ε < 1/3, and define p on X by


$$p(a) = 4 - \epsilon, \quad p(b) = 1 - \epsilon, \quad p(c) = 3 - \epsilon, \quad p(d) = 2, \quad p(e) = 6.$$

Extend p to & by using additivity. Let > be the strict weak order induced
on & by p, using Eq. (8.8). Then (X, &, >) satisfies Conditions (8.14)
through (8.16), the de Finetti axioms. [If the reader wants, he can
“normalize” p by dividing by 16 - 3ε, which is p(X).]

Note that p({d, e}) = 8 and p({a, b, c}) = 8 - 3ε, so p({d, e}) >
p({a, b, c}). We observe that there is no set A ≠ {d, e} or {a, b, c} such
that

$$\{d, e\} \succeq A \succeq \{a, b, c\}.$$


For if there is such an A, then 8 ≥ p(A) ≥ 8 - 3ε > 7. The reader can easily
convince himself that none of the other subsets of X has such a value. Now
let >’ be obtained from > by changing the order of {d, e} and {a, b, c}.
Since there is no element in between {d, e} and {a, b, c}, no additional
changes need be made. The new order >’ still satisfies the de Finetti
axioms. Conditions (8.14) and (8.15) are trivial. Condition (8.16) follows,
since there is no A ≠ ∅ such that $A \cap \{d, e\} = A \cap \{a, b, c\} = \emptyset$, and no
A ≠ ∅ is contained in both {d, e} and {a, b, c}, so the interchanged pair
enters (8.16) only with A = ∅. There
is no subjective probability measure p’ on (X, &, >’). For, suppose there
is. Note that


$$\{a\} >' \{b, c\},$$
$$\{c, d\} >' \{a, b\},$$
and
$$\{b, e\} >' \{a, c\}.$$
Thus
$$p'(a) > p'(b) + p'(c),$$
$$p'(c) + p'(d) > p'(a) + p'(b),$$
and


$$p'(b) + p'(e) > p'(a) + p'(c).$$
Adding these three inequalities and canceling p’(a) + p’(b) + p’(c) gives
$$p'(d) + p'(e) > p'(a) + p'(b) + p'(c),$$
or
$$\{d, e\} >' \{a, b, c\},$$

which is a contradiction. Thus, the de Finetti axioms are not sufficient to 
imply the existence of a subjective probability measure, even if X is finite. 
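
The finite counterexample lends itself to a mechanical check. The sketch below fixes ε = 1/4 (an arbitrary choice within 0 < ε < 1/3), confirms that no other event lies between {a, b, c} and {d, e}, verifies the Monotonicity Axiom (8.16) for the interchanged order by enumeration, and confirms the three comparisons used in the contradiction. Function and variable names are ours.

```python
from fractions import Fraction
from itertools import combinations

eps = Fraction(1, 4)  # assumed value; any 0 < eps < 1/3 should behave the same way
w = {"a": 4 - eps, "b": 1 - eps, "c": 3 - eps, "d": Fraction(2), "e": Fraction(6)}

def p(A):
    return sum((w[x] for x in A), Fraction(0))

events = [frozenset(c) for r in range(6) for c in combinations("abcde", r)]
DE, ABC = frozenset("de"), frozenset("abc")

def more(A, B):
    """The original order >, induced by p through Eq. (8.8)."""
    return p(A) > p(B)

def more_swapped(A, B):
    """The order >' obtained by interchanging {d, e} and {a, b, c}."""
    if (A, B) == (ABC, DE):
        return True
    if (A, B) == (DE, ABC):
        return False
    return more(A, B)

# Events other than the two themselves with p({a,b,c}) <= p(A) <= p({d,e}).
between = [A for A in events if A not in (DE, ABC) and p(ABC) <= p(A) <= p(DE)]

def monotone(rel):
    """Condition (8.16), checked over all triples of events."""
    return all(rel(B, C) == rel(A | B, A | C)
               for A in events for B in events for C in events
               if not (A & B) and not (A & C))

premises = [(frozenset("a"), frozenset("bc")),
            (frozenset("cd"), frozenset("ab")),
            (frozenset("be"), frozenset("ac"))]

print(between, monotone(more_swapped),
      all(more_swapped(A, B) for A, B in premises), more_swapped(ABC, DE))
```

The enumeration does not by itself prove that no measure p’ exists; that step is the addition of the three inequalities given in the text.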
8.5.3 Conditions Sufficient for a Subjective Probability Measure


Let us return to the de Finetti axioms and ask for additional axioms 
sufficient to guarantee the existence of a subjective probability measure. 
As in Section 3.1.4, some additional restrictions on the binary relation are 
required. Indeed, suppose that &* is the collection of equivalence classes 
in & under the equivalence relation ~ defined by 


$$A \sim B \iff \sim(A > B) \;\&\; \sim(B > A).$$    (8.13)


Suppose A* denotes the equivalence class containing A and suppose that
A* >* B* holds if and only if A > B. If (&, >) is strict weak, Theorem 1.4
implies that >* is well-defined. It follows from Theorem 3.4, Corollary 1,
of Section 3.1.4 that if there is a measure of subjective probability, then
(&*, >*) must have a countable order-dense subset. The de Finetti
axioms plus this new assumption are still not sufficient to guarantee the
existence of a measure of subjective probability. The Kraft-Pratt-Seidenberg
example still applies.

There are a number of additional axioms which, when added to the 
de Finetti axioms and the assumption that there exists a countable order- 
dense subset of (&*, >*), are sufficient to guarantee the existence of a 
measure of subjective probability. Most of these amount to a rather strong 
assumption, which goes back to de Finetti [1937] and Koopman [1940a, b], 
and is also used by Savage [1954]. This extra assumption says that for all n, 
there is an n-fold almost uniform partition of &, a collection of sets
$X_1, X_2, \ldots, X_n$ in & that are disjoint and whose union is X (that is, a
partition of X) and such that whenever 1 ≤ k < n, the union of no k of
these sets is more probable than the union of any k + 1 of them.

An n-fold uniform partition of & is a collection of n disjoint sets
$X_1, X_2, \ldots, X_n$ in & whose union is X and such that for all i, j, $X_i \sim X_j$. If
& has an n-fold uniform partition, then it has an n-fold almost uniform
partition.


If for all n, & has an n-fold almost uniform partition, then it follows that
& is infinite. Hence, assuming that & has such a partition rules out
examples like that of Kraft-Pratt-Seidenberg. Even for infinite &, this is a
rather strong assumption.

The following theorem is due to Roberts [1973] and is based on one of
Fine [1971a]. For other results, see Savage [1954], Suppes [1956], Kraft,
Pratt, and Seidenberg [1959], and Luce [1967]; see the book by Fine [1973] 
for an extensive discussion of related results. See also Sections 8.5.4 and 
8.5.5 and Exers. 14 through 17. 


THEOREM 8.1. Suppose X is a set, & is an algebra of subsets of X, and >
is a binary relation on &. Then the following conditions are sufficient to
guarantee the existence of a subjective probability measure on (X, &, >).
(a) The de Finetti axioms [Conditions (8.14) through (8.16)].
(b) (&*, >*) has a countable order-dense subset.
(c) For all n, there is an n-fold almost uniform partition of &.


The proof of this theorem is omitted. A major objection to the theorem 
is that it does not apply to finite situations. In the next two subsections and 
in Exers. 14 to 17, we discuss how to remedy that problem. We close this 
subsection with a uniqueness theorem. 


THEOREM 8.2. Suppose X is a set, & is an algebra of subsets of X, and > 
is a binary relation on & . Suppose p and p' are two measures of subjective 
probability on & , and for all n, there is an n-fold almost uniform partition of 
&. Then p = p’. 


Proof. The proof mimics the proof of the uniqueness theorem for
extensive measurement (Theorem 3.7 of Section 3.2.3). (The basic idea is
that without the assumption p(X) = p’(X) = 1, we can show p’ = αp,
some α > 0. Then this extra assumption gives us α = 1.) Let us assume
that p ≠ p’. Then for some A, p(A) ≠ p’(A), so without loss of generality,
p’(A) < p(A). Hence, there are positive integers m and n such that


$$p'(A) < m/n < (m + 2)/n < p(A).$$    (8.17)


Now let $X_1, X_2, \ldots, X_n$ be an n-fold almost uniform partition of &. Since
the numbers $p'(X_1), p'(X_2), \ldots, p'(X_n)$ sum to 1, the average sum of m of
these numbers is m/n. Hence, there are $X_{i_1}, X_{i_2}, \ldots, X_{i_m}$ such that


$$p'(X_{i_1}) + p'(X_{i_2}) + \cdots + p'(X_{i_m}) \geq m/n.$$


Similarly, there are $X_{j_1}, X_{j_2}, \ldots, X_{j_{m+2}}$ such that


$$p(X_{j_1}) + p(X_{j_2}) + \cdots + p(X_{j_{m+2}}) \leq (m + 2)/n.$$

By the almost uniformness of the partition,

$$p'(X_1 \cup X_2 \cup \cdots \cup X_{m+1}) \geq p'(X_{i_1} \cup X_{i_2} \cup \cdots \cup X_{i_m}) \geq m/n$$    (8.18)

and


$$p(X_1 \cup X_2 \cup \cdots \cup X_{m+1}) \leq p(X_{j_1} \cup X_{j_2} \cup \cdots \cup X_{j_{m+2}}) \leq (m + 2)/n.$$    (8.19)


Thus, by (8.17), (8.18), and (8.19),

$$p'(A) < p'(X_1 \cup X_2 \cup \cdots \cup X_{m+1})$$    (8.20)

and

$$p(X_1 \cup X_2 \cup \cdots \cup X_{m+1}) < p(A).$$    (8.21)

But now (8.20) implies that

$$X_1 \cup X_2 \cup \cdots \cup X_{m+1} > A$$

and (8.21) implies that


$$A > X_1 \cup X_2 \cup \cdots \cup X_{m+1},$$


which is impossible. ∎


It should be observed that the uniqueness of the measure of subjective 
probability is false without an extra assumption such as the one that there 
is an n-fold almost uniform partition. To see this, it suffices to consider the 
situation where X = {a, b}, & = all subsets of X, and {a, b} >
{a} > {b} > ∅. Details are left to the reader. It would be very interesting to
find, for a system (X, &, >) which has a measure of subjective probabil- 
ity, necessary and sufficient conditions for that measure to be unique.* It 
would also be interesting and practically useful (from the point of view of 
meaningfulness results) to systematically describe the types of admissible 
transformations of subjective probability measures which can arise. 
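
The two-point example can be spelled out in a few lines. The sketch below encodes the order {a, b} > {a} > {b} > ∅ as a chain and tests whether a candidate value of p({a}) yields a measure satisfying Eq. (8.8); the candidate values are ours, picked for illustration.

```python
from fractions import Fraction

EMPTY, A, B, AB = frozenset(), frozenset("a"), frozenset("b"), frozenset("ab")
CHAIN = [AB, A, B, EMPTY]  # {a, b} > {a} > {b} > empty set

def more(S, T):
    """S > T exactly when S comes earlier in the chain."""
    return CHAIN.index(S) < CHAIN.index(T)

def is_representation(pa):
    """Does p({a}) = pa, p({b}) = 1 - pa satisfy S > T iff p(S) > p(T)?"""
    p = {EMPTY: Fraction(0), A: pa, B: 1 - pa, AB: Fraction(1)}
    return all(more(S, T) == (p[S] > p[T]) for S in CHAIN for T in CHAIN)

candidates = [Fraction(1, 2), Fraction(3, 5), Fraction(9, 10), Fraction(1)]
print({str(c): is_representation(c) for c in candidates})
```

Any value of p({a}) strictly between 1/2 and 1 works, so the measure is far from unique here.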


8.5.4 Necessary and Sufficient Conditions for a Subjective 
Probability Measure in the Finite Case 


In their paper, Kraft, Pratt, and Seidenberg [1959] present a set of 
conditions that are necessary and sufficient to guarantee the existence of a 
subjective probability measure on a structure (X, & , >), in the case that X 
is finite. In a later paper, Scott [1964] presents much the same conditions in 
a clearer format. These conditions are closely related to Scott’s axioms for 
difference measurement, which we discussed in Section 3.3.2, and to 
Scott’s axioms for conjoint measurement, which we discussed in Section 
5.4.5. We shall present these conditions. For other related formulations, see 
Domotor [1969], Fishburn [1969], and Krantz et al. [1971, Section 9.2.2]. 


*The theorem of Suppes and Zanotti [1976], which we present in Section 8.5.5, gives 
conditions necessary and sufficient for the existence of a unique measure of subjective 
probability for the set we shall call &* . However, the conditions given in that theorem do not 
give rise to a unique subjective probability measure on &. See Exer. 11.

To present the axioms, we need to use the characteristic function $\chi_A$ of a set
A in &. $\chi_A$ is a function from X into {0, 1} such that


$$\chi_A(a) = \begin{cases} 1 & \text{if } a \in A, \\ 0 & \text{if } a \notin A. \end{cases}$$


(The reader will easily verify that
$$\chi_{A \cup B} = \chi_A + \chi_B - \chi_{A \cap B}$$
and that


$$\chi_{A \cap B} \equiv 0 \iff A \cap B = \emptyset.)$$


Suppose X is a set, & an algebra of subsets of X, and > a binary 
relation on &. Then the triple (X, & , >) is a subjective probability structure 
in the sense of Scott if the following axioms are satisfied: 


AXIOM SP1. X > ∅ and A ≽ ∅ for all A in &.
AXIOM SP2. A ≽ B or B ≽ A for all A, B in &.


AXIOM SP3. Whenever $A_1, A_2, \ldots, A_r, B_1, B_2, \ldots, B_r$ are in & and $A_i \succeq B_i$
for all i < r and


$$\chi_{A_1} + \chi_{A_2} + \cdots + \chi_{A_r} = \chi_{B_1} + \chi_{B_2} + \cdots + \chi_{B_r},$$    (8.22)


then $B_r \succeq A_r$.


The reader should note that Axiom SP3 is really an infinite “schema” of
axioms, one for each r. As in the case of difference and conjoint
measurement, it does not reduce to a finite schema: elements $A_i$ may be repeated.
Axiom SP3 is unpleasant because it is stated in terms of characteristic
functions rather than in terms of union or other primitive notions. However,
Eq. (8.22) has a rather simple interpretation; it says that for each
a in X, a is in exactly as many $A_i$ as $B_i$.
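
The counting interpretation of Eq. (8.22) makes Axiom SP3 easy to test on a given family of events. As an illustration, the sketch below applies it to the interchanged order >’ of the Kraft-Pratt-Seidenberg example of Section 8.5.2, with ε = 1/4 assumed; the choice of the four pairs is ours and mirrors the three inequalities used there.

```python
from collections import Counter
from fractions import Fraction

eps = Fraction(1, 4)  # assumed value with 0 < eps < 1/3
w = {"a": 4 - eps, "b": 1 - eps, "c": 3 - eps, "d": Fraction(2), "e": Fraction(6)}
DE, ABC = frozenset("de"), frozenset("abc")

def p(A):
    return sum((w[x] for x in A), Fraction(0))

def at_least_swapped(A, B):
    """A is at least as probable as B in the interchanged order >'."""
    if (A, B) == (DE, ABC):
        return False          # {a, b, c} is strictly above {d, e} in >'
    if (A, B) == (ABC, DE):
        return True
    return p(A) >= p(B)

def membership_counts(sets):
    """For each point of X, the number of listed sets containing it (sum of the chi's)."""
    return Counter(x for S in sets for x in S)

As = [frozenset("a"), frozenset("cd"), frozenset("be"), ABC]
Bs = [frozenset("bc"), frozenset("ab"), frozenset("ac"), DE]

same_counts = membership_counts(As) == membership_counts(Bs)          # Eq. (8.22)
premises_hold = all(at_least_swapped(A, B) for A, B in zip(As[:-1], Bs[:-1]))
conclusion_holds = at_least_swapped(Bs[-1], As[-1])                   # B_r at least A_r

print(same_counts, premises_hold, conclusion_holds)
```

The hypotheses of SP3 hold for these four pairs while its conclusion fails, so this order is not a subjective probability structure in the sense of Scott, in agreement with the fact that it has no representing measure.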


THEOREM 8.3 (Scott [1964]). Suppose $X$ is a finite set, $\mathcal{E}$ is an algebra of 
subsets of $X$, and $\succ$ is a binary relation on $\mathcal{E}$. Then there is a measure of 
subjective probability on $(X, \mathcal{E}, \succ)$ if and only if $(X, \mathcal{E}, \succ)$ is a subjective 
probability structure in the sense of Scott. 


We shall present a proof of the necessity of Scott’s axioms. The proof of 
sufficiency, like those for Scott’s Theorems 3.9 and 5.4, uses a clever 
variant of the separating hyperplane theorem. Axioms SP1 and SP2 are 
special cases of the de Finetti axioms, and so follow from the representation. 
Turning to Axiom SP3, let $X = \{x_1, x_2, \ldots, x_n\}$. Assume that $\{x_i\}$ is 
in $\mathcal{E}$ for each $i$. The following argument is easy to fix up if this assumption 
does not hold, by use of minimal sets in $\mathcal{E}$. We leave details of that to the 
reader. Define a linear functional $L$ on $\mathrm{Re}^n$ by 

$$L(0, \ldots, 0, 1, 0, \ldots, 0) = p(\{x_i\}),$$

where there is a 1 in the $i$th place and $p$ is a measure of subjective 
probability. Identify $\chi_A$ with the vector that has a 1 in the $i$th place if and 
only if $x_i$ is in $A$. Then, since $p$ is finitely additive, 

$$L(\chi_A) = \sum_{x_i \in A} p(\{x_i\}) = p(A).$$

Moreover, 

$$\begin{aligned} L(\chi_{A_1} + \chi_{A_2} + \cdots + \chi_{A_r}) = L(\chi_{B_1} + \chi_{B_2} + \cdots + \chi_{B_r}) &\Rightarrow \sum_{i=1}^{r} L(\chi_{A_i}) = \sum_{i=1}^{r} L(\chi_{B_i}) \\ &\Rightarrow \sum_{i=1}^{r} p(A_i) = \sum_{i=1}^{r} p(B_i). \end{aligned} \tag{8.23}$$

Now $A_i \succsim B_i$ for $i < r$ implies 

$$p(A_i) \geq p(B_i), \quad i = 1, 2, \ldots, r - 1.$$

Thus, (8.23) implies $p(A_r) \leq p(B_r)$, or $B_r \succsim A_r$. 
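The final inference can be spelled out. As a sketch, assuming the sequences in Axiom SP3 are indexed from 1 to $r$, isolate the $r$th terms in Eq. (8.23) and apply the inequalities $p(A_i) \geq p(B_i)$ for $i < r$:

```latex
\begin{align*}
p(A_r) &= \sum_{i=1}^{r} p(B_i) - \sum_{i=1}^{r-1} p(A_i) \\
       &\leq \sum_{i=1}^{r} p(B_i) - \sum_{i=1}^{r-1} p(B_i) \\
       &= p(B_r).
\end{align*}
```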

As we observed after the proof of Theorem 8.2, the measure of subjec- 
tive probability may not be unique in the finite case. Sufficient conditions 
for uniqueness in the finite case are given by Suppes [1969]. 

Extending the methods used to prove Theorem 8.3, Scott has obtained 
some conditions that are necessary and sufficient for the existence of a 
measure of subjective probability, even if $\mathcal{E}$ is infinite. But these conditions 
are rather complicated and have not been published. 

To the author’s knowledge, most people are willing to accept Scott’s first 
and second axioms, and no one has subjected Scott’s third axiom to a test. 


8.5.5 Necessary and Sufficient Conditions for a Subjective 
Probability Measure in the General Case 


Suppes and Zanotti [1976] have recently given necessary and sufficient 
conditions for the existence of a subjective probability measure without the 
assumption of finiteness. Their basic idea is to use the axioms for extensive 
measurement. This is accomplished by passing to a more general structure 
than the algebra of subsets $\mathcal{E}$. 

In particular, Suppes and Zanotti start with the correspondence between 
the elements of $\mathcal{E}$ and their characteristic functions, as does Scott. They 
define an extended characteristic function of $\mathcal{E}$ to be a finite sum of 
characteristic functions (repetitions allowed) and $\mathcal{E}^*$ to be the set of 
extended characteristic functions of $\mathcal{E}$. 

Suppose $p$ is a measure of subjective probability for $(X, \mathcal{E}, \succ)$. Then we 
may define a function $E$ on $\mathcal{E}^*$ by defining $E(A^*)$ to be $\sum_i p(A_i)$ if 
$A^* = \sum_i \chi_{A_i}$. Thus, in particular, if $A^* = \chi_A$, $E(A^*) = p(A)$. We define 
$\succ^*$ on $\mathcal{E}^*$ by 

$$A^* \succ^* B^* \iff E(A^*) > E(B^*). \tag{8.24}$$

Then $\succ^*$ restricted to $\mathcal{E}$ (that is, to characteristic functions of sets in $\mathcal{E}$) is just 
$\succ$. We say $\succ^*$ extends $\succ$. The function $E$ obviously satisfies 

$$E(A^* + B^*) = E(A^*) + E(B^*). \tag{8.25}$$

Hence, the triple $(\mathcal{E}^*, \succ^*, +)$ satisfies the necessary and sufficient conditions 
for extensive measurement developed in Section 3.2.2; that is, it 
satisfies the axioms for an extensive structure. Using the notation $\succsim^*$ and 
$\sim^*$ defined from $\succ^*$ as $\succsim$ and $\sim$ were from $\succ$ in Eqs. (8.12) and (8.13), 
we may write these axioms as follows: 


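AXIOM SZ1. For all $A^*, B^*, C^* \in \mathcal{E}^*$, 

$$A^* + (B^* + C^*) \sim^* (A^* + B^*) + C^*.$$

AXIOM SZ2. $(\mathcal{E}^*, \succ^*)$ is a strict weak order. 

AXIOM SZ3. For all $A^*, B^*, C^* \in \mathcal{E}^*$, 

$$A^* \succ^* B^* \iff A^* + C^* \succ^* B^* + C^* \iff C^* + A^* \succ^* C^* + B^*.$$

AXIOM SZ4. For all $A^*, B^*, C^*, D^* \in \mathcal{E}^*$, if $A^* \succ^* B^*$, then there is 
a positive integer $n$ such that 

$$nA^* + C^* \succ^* nB^* + D^*.$$

In addition, it is clear that the following axioms also hold: 

AXIOM SZ5. $\chi_X \succ^* \chi_\emptyset$. 

AXIOM SZ6. $A^* \succsim^* \chi_\emptyset$. 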
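The construction of $E$ and $\succ^*$ can be illustrated on a small example. The sketch below assumes a hypothetical two-point set with all subsets as events and an illustrative measure `p_point`; extended characteristic functions are stored as tuples of nonnegative integers. It checks Eq. (8.25) and Axioms SZ3, SZ5, and SZ6 on a finite sample of the extended characteristic functions only, so it is an illustration rather than a proof.

```python
from fractions import Fraction
from itertools import combinations, product

X = ["a", "b"]
# Hypothetical measure of subjective probability, for illustration only.
p_point = {"a": Fraction(2, 3), "b": Fraction(1, 3)}
events = [frozenset(c) for k in range(len(X) + 1) for c in combinations(X, k)]


def chi(A):
    """Characteristic function of A as a 0/1 tuple indexed like X."""
    return tuple(1 if x in A else 0 for x in X)


def add(f, g):
    """Pointwise sum of two extended characteristic functions."""
    return tuple(s + t for s, t in zip(f, g))


def E(f):
    """E(A*) computed pointwise as the sum of f(x) * p({x}).

    When every singleton is an event, this equals p(A_1) + ... + p(A_k)
    for any way of writing A* as chi_{A_1} + ... + chi_{A_k}.
    """
    return sum((Fraction(v) * p_point[x] for x, v in zip(X, f)), Fraction(0))


def strictly_above(f, g):
    """The relation of Eq. (8.24): f is above g exactly when E(f) > E(g)."""
    return E(f) > E(g)


# Finite sample: sums of at most two characteristic functions.
sample = {chi(A) for A in events} | {add(chi(A), chi(B))
                                     for A in events for B in events}

additive = all(E(add(f, g)) == E(f) + E(g) for f in sample for g in sample)
sz3 = all(strictly_above(f, g) == strictly_above(add(f, h), add(g, h))
          for f, g, h in product(sample, repeat=3))
sz5 = strictly_above(chi(frozenset(X)), chi(frozenset()))
sz6 = all(E(f) >= E(chi(frozenset())) for f in sample)
print(additive, sz3, sz5, sz6)
```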


THEOREM 8.4 (Suppes and Zanotti [1976]). Suppose $X$ is a set, $\mathcal{E}$ is an 
algebra of subsets of $X$, $\succ$ is a binary relation on $\mathcal{E}$, and $\mathcal{E}^*$ is the set of 
extended characteristic functions of $\mathcal{E}$. Then there is a measure of subjective 
probability on $(X, \mathcal{E}, \succ)$ if and only if there is a binary relation $\succ^*$ on $\mathcal{E}^*$ 
extending $\succ$ and such that $(\mathcal{E}^*, \succ^*, +)$ satisfies Axioms SZ2, SZ3, SZ4, 
SZ5, and SZ6. 


Proof. We have already shown the necessity of these axioms. To show 
their sufficiency, we note that Axiom SZ1, with $=$ instead of $\sim^*$, follows 
from the definition of function addition. Then Axiom SZ1 follows, for $=$ 
implies $\sim^*$, since by Axiom SZ2, $\succ^*$ is strict weak. Now, by Theorem 
3.6, Axioms SZ1 through SZ4 imply that there is a function $E$ on $\mathcal{E}^*$ 
satisfying Eqs. (8.24) and (8.25). Note that $E(\chi_\emptyset) = 0$, since by Eq. (8.25), 

$$E(\chi_\emptyset) + E(\chi_\emptyset) = E(\chi_\emptyset + \chi_\emptyset) = E(\chi_\emptyset).$$

By Axiom SZ5 and Eq. (8.24), 

$$E(\chi_X) > E(\chi_\emptyset) = 0.$$

Let $p(A) = E(\chi_A)/E(\chi_X)$. We verify that $p$ defines a measure of subjective 
probability. Since $\succ^*$ extends $\succ$, Eq. (8.24) implies that 

$$A \succ B \iff p(A) > p(B).$$

Axiom SZ6 implies that 

$$p(A) = E(\chi_A)/E(\chi_X) \geq E(\chi_\emptyset)/E(\chi_X) = 0.$$

Also, 

$$p(X) = E(\chi_X)/E(\chi_X) = 1.$$

Finally, if $A \cap B = \emptyset$, then 

$$\begin{aligned} p(A \cup B) &= E(\chi_{A \cup B})/E(\chi_X) \\ &= E(\chi_A + \chi_B)/E(\chi_X) \\ &= E(\chi_A)/E(\chi_X) + E(\chi_B)/E(\chi_X) \\ &= p(A) + p(B). \qquad \blacksquare \end{aligned}$$
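The normalization step in the proof can be tried out numerically. The sketch below assumes a hypothetical additive function on characteristic functions, given by unnormalized positive weights on three points; `weight`, `E_chi`, and `p` are illustrative names, not part of the source.

```python
from fractions import Fraction
from itertools import combinations

X = ["a", "b", "c"]
# Hypothetical additive E, specified by unnormalized positive point weights.
weight = {"a": Fraction(3), "b": Fraction(2), "c": Fraction(1)}
events = [frozenset(c) for k in range(len(X) + 1) for c in combinations(X, k)]


def E_chi(A):
    """E applied to the characteristic function of the event A."""
    return sum((weight[x] for x in A), Fraction(0))


def p(A):
    """p(A) = E(chi_A) / E(chi_X), the normalization used in the proof."""
    return E_chi(A) / E_chi(frozenset(X))


nonnegative = all(p(A) >= 0 for A in events)
normalized = p(frozenset(X)) == 1
additive = all(p(A | B) == p(A) + p(B)
               for A in events for B in events if not (A & B))
print(nonnegative, normalized, additive)
```

Dividing by the value at $\chi_X$ is what turns an additive representation, which is determined only up to a positive scale factor, into a function with $p(X) = 1$.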


One unpleasant aspect of the Suppes-Zanotti result is that it speaks in 
terms of the existence of an extension to a larger set $\mathcal{E}^*$, and so is not 
explicitly a representation theorem in terms of $(X, \mathcal{E}, \succ)$ alone. To the 
author’s knowledge, it remains an open problem to state nice, necessary 
and sufficient conditions for the existence of a subjective probability 
measure which are stated in terms of $(X, \mathcal{E}, \succ)$ alone, and which do not 
require the assumption of finiteness. 


Exercises 


1. Show that if $(X, \mathcal{E}, \succ)$ has a measure of subjective probability and 
if 

$$\{a, e, f\} \succ \{b, e\},$$

$$\{b, c, e\} \sim \{a, e, c, d\},$$

then 

$$\{f\} \succ \{d\}.$$


2. Suppose $X = \{A, B, C, D, E\}$, $(X, \mathcal{E})$ is an algebra of subsets, $\succ$ is 
a binary relation on $\mathcal{E}$, and it is known that 
($\alpha$) $\{A\} \sim \{E\}$, 
($\beta$) $\{B\} \sim \{C\}$, 
($\delta$) $\{A, E\} \sim \{B\}$. 
(a) Show that these hypotheses imply the following: 
(i) If there is a measure $p$ of subjective probability, then 

$$p(D) > p(A) + p(B).$$

(ii) Moreover, the measure $p$ is determined uniquely. 
(b) However, show that the conditions ($\alpha$), ($\beta$), and ($\gamma$) do not 
determine $p$ uniquely. 


3. (a) Suppose $X = \{a, b\}$, $\mathcal{E}$ = all subsets of $X$, and $\succ$ is a strict 
weak order on $\mathcal{E}$ defined by 

$$\{a, b\} \succ \{a\} \succ \{b\} \succ \emptyset.$$

Show that there is a continuum of subjective probability measures on 
$(X, \mathcal{E}, \succ)$. 

(b) Use one of Scott’s axioms to show that, since $\{a\} \succ \{b\}$ and 
$\{a, b\} \succ \{a\}$, we must have $\{a, b\} \succ \{b\}$. 

4. Raiffa [1968] discusses the proportion $\lambda$ of medical doctors who are 
non-teetotalers and consume more scotch than bourbon. This exercise is 
based on some of his discussion of subjective estimates of the probability 
that $\lambda$ is at least or at most a certain amount. Let $X$ be the set of all 
outcomes $\lambda = r$ for $r \in [0, 1]$. Let $\mathcal{E}$ be the family of the following subsets 
of $X$: 

$\emptyset$, 
$A$, which says that $\lambda \geq .6$ 
$B$, which says that $\lambda < .6$ 
$C$, which says that $\lambda \geq .9$ 
$D$, which says that $\lambda < .9$ 
$E$, which says that $\lambda < .6$ or $\lambda \geq .9$ 
$F$, which says that $\lambda \geq .6$ and $\lambda < .9$ 
$X$, which says that $\lambda \geq 0$ and $\lambda \leq 1$. 
(a) Observe that $\mathcal{E}$ is an algebra of subsets of $X$. 
(b) The comparative probability relation $\succ$ on $\mathcal{E}$ is the strict weak 
order defined by 

$$X \succ D \succ E \succ A \sim B \succ F \succ C \succ \emptyset.$$

Does there exist a measure of subjective probability on $(X, \mathcal{E}, \succ)$? 
(c) If the answer to part (b) is yes, is this measure unique? 
5. Show that under the de Finetti axioms, if $A \subseteq B$, then it follows that 
$B \succsim A$, but not necessarily that $B \succ A$. 
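6. Prove the following results from the de Finetti axioms: 
(a) If $A \succ \emptyset$ and $A \cap B = \emptyset$, then $A \cup B \succ B$. 
(b) If $A \succsim B$, then $B^c \succsim A^c$. 
(c) If $A \succsim B$, $C \succsim D$, and $A \cap C = \emptyset$, then $A \cup C \succsim B \cup D$. 
(d) If $A \cup B \succsim C \cup D$ and $C \cap D = \emptyset$, then $A \succsim C$ or $B \succsim D$. 
(e) If $B \succsim B^c$ and $C^c \succsim C$, then $B \succsim C$. 
(f) If $A \sim \emptyset$ and $B \subseteq A$, then $B \sim \emptyset$. 
(g) If $A \sim \emptyset$ and $B \sim \emptyset$, then $A \cap B \sim \emptyset$ and $A \cup B \sim \emptyset$. 
(h) If $A \sim \emptyset$, then $A^c \succ \emptyset$. 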


7. Suppose $(X, \mathcal{E}, \succ)$ is as in Exer. 3. 
(a) Verify Scott’s axioms directly. 
(b) Identify the extended characteristic functions $\mathcal{E}^*$ and the binary 
relation $\succ^*$ on $\mathcal{E}^*$ defined in Section 8.5.5. 


8. Let $X = \{1, 2, \ldots, n\}$ and let $\mathcal{E}$ be all subsets of $X$. Define $\succ$ on 
$\mathcal{E}$ by 

$$A \succ B \iff |A| > |B|.$$

Show that $(X, \mathcal{E}, \succ)$ has a measure of subjective probability. 


9. Consider the uniqueness of the measure of subjective probability in 
Exer. 8. 


10. (a) If $(X, \mathcal{E}, \succ)$ is as in Exer. 3, show that there is no n-fold almost 
uniform partition. 
(b) Is there such a partition for $(X, \mathcal{E}, \succ)$ as in Exer. 4? 
(c) What about for $(X, \mathcal{E}, \succ)$ of Exer. 8? 


11. Given $(X, \mathcal{E}, \succ)$, suppose $\succ^*$ extends $\succ$ to $\mathcal{E}^*$ in such a way that 
Axioms SZ2 through SZ6 are satisfied. 
(a) Show that there is a unique function $E$ on $\mathcal{E}^*$ satisfying Eqs. 
(8.24) and (8.25) and the condition $E(\chi_X) = 1$. 
(b) Show, however, that this does not imply that there is a unique 
subjective probability measure on $(X, \mathcal{E}, \succ)$. 

12. Prove transitivity of $\succsim$ directly from Scott’s Axiom SP3 by observing 
that for all $A, B, C$, 

$$\chi_A + \chi_B + \chi_C = \chi_B + \chi_C + \chi_A.$$


13. Show that the de Finetti axioms are independent. 

14. Luce [1967] has given conditions on $(X, \mathcal{E}, \succ)$ sufficient for the 
existence of a measure of subjective probability, which do not require that 
$\mathcal{E}$ be infinite. (For a discussion, see Krantz et al. [1971, Section 5.2.3].) 
Exercises 14 to 18 discuss Luce’s conditions. These conditions are the three 
de Finetti axioms and two additional axioms. The first is the following. 


We say that a sequence $A_1, \ldots, A_i, \ldots$ from $\mathcal{E}$ is a standard sequence 
relative to $A$ in $\mathcal{E}$ if for $i = 1, 2, \ldots$, there exist $B_i, C_i$ in $\mathcal{E}$ such that 

(1) $A_1 = B_1$ and $B_1 \sim A$, 

(2) $B_i \cap C_i = \emptyset$, 

(3) $B_i \sim A_i$, 

(4) $C_i \sim A$, 

(5) $A_{i+1} = B_i \cup C_i$. 


ARCHIMEDEAN AXIOM: If $A \succ \emptyset$, then every standard sequence relative to 
$A$ is finite. 


Show that this axiom is a necessary condition for the existence of a 
measure of subjective probability and that it does indeed express an 
Archimedean condition. 


15. Luce’s second additional axiom is the following: 


AXIOM A. Suppose that $(X, \mathcal{E}, \succ)$ satisfies the de Finetti axioms. 
Suppose $A, B, C, D$ are in $\mathcal{E}$, $A \cap B = \emptyset$, $A \succ C$, and $B \succsim D$. Then there 
exist $C', D', E$ in $\mathcal{E}$ such that 

(1) $E \sim A \cup B$, 

(2) $C' \cap D' = \emptyset$, 

(3) $E \supseteq C' \cup D'$, 

(4) $C' \sim C$ and $D' \sim D$. 


(As Krantz et al. [1971] point out, Axiom A is difficult to explain without 
simply restating it in words. We leave this restatement to the reader.) 
Axiom A is satisfied by some finite structures $(X, \mathcal{E}, \succ)$. For example, let 
$X = \{a, b, c, d\}$, let $\mathcal{E}$ be all subsets of $X$, and let $P(a) = P(b) = P(c) = 
0.2$, $P(d) = 0.4$. Show that Axiom A is satisfied. (Luce proves that the 
de Finetti axioms, the Archimedean Axiom, and Axiom A are sufficient to 
prove the existence of a measure of subjective probability.) 

16. Show that Axiom A is not a necessary condition for the existence of 
a measure of subjective probability. (For example, any structure 
$(X, \mathcal{E}, \succ)$ whose equivalence classes under $\sim$ fail to form a single 
standard sequence violates Axiom A.) 


17. Show that if $(X, \succ)$ is a strict simple order, then Luce’s axioms 
imply that $\mathcal{E}$ is infinite. 

18. Show that the structure $(X, \mathcal{E}, \succ)$ of Exer. 8 satisfies Luce’s Axiom 
A. 

Decibel, 150 
Decisionmaking, xix, xx, 5-6 
group, 6, 118-119, 273, 297-298, 377-379 
individual, 6 
under risk or uncertainty, 305-368 
using the EU Rule and Hypothesis, 
312-335 
Decomposability, 232-235 
non-decomposability, 235-236 
Decomposable relation, 232 
Decreasing marginal utility, 185 (see also 
Utility, marginal) 
de Finetti axioms, 386-389, 390, 391, 394, 
399 
defined, 388 
Degree 
of certainty, 372 
of conviction, 372 
of hazard, 90 
of precision, 138 
Dense set, 112 
Density, measurement of, 3, 76, 77, 78, 79, 
232 
Denumerable set, 109 
Department of Public Health, State of Cali- 
fornia, 152, 194 
Derived measurement, 2, 3, 49, 76 ff., 174, 
274 
Derived scale, 77 
Descriptive axioms (rules), 3, 4, 102, 309 
Deterministic theories, 7 
Developmental psychology, 270 
Diagonal vector, 208 
Difference measurement, 134-146, 189 
algebraic, 135, 189 
necessary and sufficient conditions for, 139 
sufficient conditions for, 139 
uniqueness theorem for, 141 
Difference scale, 66 
Difference structure, equally spaced, 145 
Dimension 
of interval order, 268 
of semiorder, 268 
of (strict) partial order, 39-40, 202-203, 
271 
Dimensional parameter, 76-77, 79, 165, 174 
Dimensionality, reducing, 203-206, 336-337 
Discomfort index, 212 
Discount rate, 206 
Distributive cancellation, 237 
Distributive laws, 231 
Distributive rule, 233, 237 
Dominance condition, 207 
Double cancellation, 220, 221, 222, 237 
Drive, 198, 211, 220 
Dual distributive rule, 237 
Duration, 149 


Ecosystems, 270 
Effect factor, 89 
Elections, 6 
Energy 
demand for, 82-83 
magnitude estimation for, 83 
and environmental systems, 375 
use of, 82-83 
Entropy, 174 
Environmental episodes, probabilities of, 371-372, 374 
Environmental Protection Agency (U.S.), 89, 99, 150, 194 
e-betweenness, 265 
Equal loudness contour, 150, 151 
Equal sensation function, 182, 183, 184 
Equally spaced difference structure, 145 
Equally spaced (product) structure, 222 
Equivalence class, 25 
representative of, 25 
Equivalence relation, 15, 25-28 
Error theory, 104 
Essential (Component), 218 
Essentiality, 218, 229-230 
EU Hypothesis (see Expected Utility Hypothesis) 
EU Rule (see Expected Utility Rule) 
EV Hypothesis (see Expected Value Hypothesis) 
Event, 370 
Expected utility, 306, 383 
defined, 306, 307 
Savage axioms for, 382-383, 385 
Expected Utility Hypothesis, 305-312, 314, 
329, 338, 380 
in decisionmaking, 312-335 
defined, 309 
meaningfulness of decisions using, 314- 
315 
representation theorem for, 352, 356 
subjective, 309 
tests of, 316-317, 322 
Expected Utility Rule, 305-312, 314 
in decisionmaking, 312-335 
defined, 309 
meaningfulness of decisions using, 310, 
314-315 
tests of, 316-317 
Expected utility representation, semiordered 
version, 265 
Expected value, 306, 314 
Expected Value Hypothesis, 314, 338, 339, 
343, 344 
representation theorem for, 352, 356 
Experts’ ratings, 82-83, 95-97 
Extending 
comparative probabilities, 395 
preferences over consequences to prefer- 
ences over lotteries, 318, 383 
Extensive attribute, 122 
Extensive measurement, 122-134, 230, 231, 
394, 395 
defined, 122 
necessary and sufficient conditions for, 
126-129, 133 
semiordered version, 264 
sufficient conditions for (see Holder’s The- 
orem) 
uniqueness theorem for, 129-130, 391 
where combination is restricted, 126 
Extensive structure, 128, 230, 395 


Fechner’s Law, 153, 154, 277 
Fechnerian utility model, 153, 277-279, 280, 
281, 283, 284, 285, 297 
defined, 277 
Fermat, 370 
Finitely additive probability space, 371 
Fisher index, ideal, 93 
Fractional hypercube representation, 236 
Frequency, 149 
Function, 41-44 
Functional equation, 159 (see also Cauchy 
equations) 
Fundamental measurement, 2, 3, 49ff. 
defined, 54 
statistical analogues of, 103 (see also Statistical analogues of measurement theories) 


Gamble, 307 
Gambler’s fallacy, 374, 376 
Gap, 114 
lower end point of, 114 
upper end point of, 114 
Generalized ratio scale, 282 
Genetics, 270 
Geometrical laws, 174 
Geometric mean, 81-84, 91 
defined, 82 
Group, 122 
Group consensus, 118 
Group decisionmaking, 6, 118-119, 273, 297-298, 377-379 
Group preferences, 118-119 
Guttman scales, 238-239, 240 
generalization of, 271 


Habit strength, 198, 211 

Hardness, 64, 65 

Hasse diagram, 37, 38 

Hearing level, 158 

Hiraguchi’s Theorem, 40 

Holder’s Theorem, 122-126, 128 
stated, 124 

Homogeneous families of semiorders, 287- 

290 

Homomorphic relational systems, 52 

Homomorphism 
alternative definitions, 52, 55-56 
definition of, 52 

Hysteresis, 134 


Identity, 122 
Immediate successor, 144 
Impossible event, 369 
Incentive, 198, 211, 220 
Income streams, 312 
Inconsistency of judgments, 119, 272 (see 
also Probabilistic consistency) 
Independence, 215, 219-220, 237, 340, 345, 
347, 347 
on a component, 215 
conditions of, 234, 237-238 
for pair comparison systems, 297 
joint, 234, 237 
marginal, 338 
strong, 340-341, 345 
utility, 340 
value, 338 
Index numbers, 84-86, 177-178 (see also Consumer confidence index, Consumer 
price index) 
Indifference, 28, 29, 247, 257, 268-271, 313, 
318 
nontransitivity of, 247-250, 358 
transitivity of, 248, 265, 276 
Indifference curve, 208 
Indifference graph, 269, 270, 271, 298 
Industrial Environmental Research Labora- 
tory, 89, 90, 99 
Inefficiency of a power plant, measurement 
of, 80 
Information processing, 220 
Insurance, 311, 325 
Intelligence tests, 64, 65, 81 
raw scores, 64, 65, 81 
standard scores, 64, 65, 81 


Intensity, 149, 150 


Intersection of relations, 18 
Interval graph, 255, 270, 271 
Interval order, 253-255, 358 

defined, 254 

uniqueness of representation for, 254 
Interval scale, 51, 64 

defined, 65 

possible comparisons with, 74 
Interval topology, 121 
Investment, 312 
Inverse, 122 
IQ, 69 

average, 81 
Isomorphic relational systems, 52 
Isomorphism, 52 


Jensen’s equation, 173 

Jevons, 7 

Joint independence, 234, 237 

Joint ordering of individuals and reactions, 
239 

Joint scale of individuals and alternatives, 
238-243 


Kelvin scale, 51, 64, 65, 66 

Kinetic energy, measurement of, 230 

Kraft, Pratt, Seidenberg example, 388-389, 
390 


Laspeyres price index, 91, 92, 93, 177 

Law of exchange, 230, 231 

Left-order-dense set, 120-121 

Lexicographic ordering (on a Cartesian 
product), 119-120 

Lexicographic ordering of the plane, 35, 111, 
113, 208 

Linear order, 29 (see Simple order) 




Log-interval scale, 66, 144, 175 
Lotteries, 312-317, 379-386 
expected utility of, 314 
expected value of, 314 
Louder than, 16 
transitivity of, 20 
Loudness 
binaural, 198, 212, 220, 222 (see also 
Fletcher, H., in Author Index) 
magnitude estimation of, 179 
measurement of, 42, 50, 64, 65, 66, 69, 119, 
131, 149-150, 179, 180, 181, 182, 183, 
202, 249, 266-267, 275, 288, 291, 
298-299 
Loudness summation hypothesis, 154, 212 
Luce-Tukey Theorem, 215-222 


Magnitude-cross-modality consistency con- 
dition, 187-188 
Magnitude estimation, 82, 83, 179-180, 181, 
184, 185 
leading to ratio scale, 180, 186-193 
measurement axiomatization of, 186-193 
scale type of, 83 
Magnitude-pair consistency condition, 187 
(see also Pair consistency condition) 
second, 188 
Magnitude production, 180 
Marginality assumption, 338, 339, 340, 348 
Marginal independence, 338 
Marginal utility, 325 
decreasing, 185 
Market basket, 197, 207, 219, 221, 340, 345, 
346 (see also Commodity bundle) 
Mass, measurement of, 50, 51, 52, 54, 58, 64, 
65, 124-125, 128 
MATE criteria, 89 
Mathematics, 8-9 
Mean(s), comparison of, 81-84, 91, 377 
Meaningful statement 
alternative definition, 59 
definition, 58 
in derived measurement, 79 
problems with definition of for irregular 
scales, 76 
and statistical summaries and tests, 83-84 
Measurement 
definition of, 49 ff. 
derived, 2, 3, 49, 76 ff., 174, 274 
foundations of, xix, 1, 3, 4 
fundamental, 2, 3 
without numbers, 253-255 
Medians, comparison of, 91, 377 
Medical decisionmaking, 205-206, 307-309, 
320, 336-337, 342, 373, 374, 376 
Mental testing, 198, 212, 219, 226, 238 
Metathetic continua, 155 
Mexico City airport, 336, 378 
Midpoint, 243 
Minimum Acute Toxicity Effluent (MATE) 
Criteria, 89 
Mixture, 353 
Mixture space, 352-364 
defined, 353 
EU, 354 
representation and uniqueness theorem 
for, 356 
Moderate Stochastic Transitivity, 284, 285, 
287 
Mohs scale, 65 
Momentum, 214 
Monotonicity, 123, 127, 134, 356, 381, 386, 
387 
weak, 136 
Monotone increasing transformation, 64, 65 
MST (see Moderate Stochastic Transitivity) 
Multiattributed alternatives, 197 (see Multidimensional alternatives) 
Multidimensional alternatives, 37, 197, 286, 
332, 336-352 
preferences among, 197, 219, 220 
Multidimensional scaling, 242 
Multidimensional utility theory, 199, 206 
Multiplicative representation, 131, 214, 344 
(see also Polynomial conjoint mea- 
surement) 


National Air Pollution Control Administra- 
tion (U.S.), 87, 99 

Negative transitivity, 15, 20 

Newton’s Law of Gravitation, 164, 174 

n-fold almost uniform partition, 390 

n-fold uniform partition, 390 

Noise, 149-150 

levels of, 152 

Noise pollution, 149-150 

Nominal scale, 64, 66 

Nonadditive representations, 232-238 

Nondecomposable representations, 235-236 

Nonsatiety, 208 

Nonsaturation, 208 

Nonsimple distributive conjoint measure- 
ment, 238 

Normative axioms (rules), 3, 4, 102, 309 


Observable property, 283 

Ohio environmental agency, goals for, 24-25, 118, 266 

Ohm’s Law, 164, 174 

Open ray, 121 

Operation, 41-44 
binary, 41 
defined, 41 
Opinion scaling, 184-185 
Order-dense set, 111-112 
left, 120-121 
Order relations, 15, 28-36 
Ordinal measurement, 101-121, 240 (see also 
Utility function, ordinal; Utility func- 
tion, order-preserving) 
non-existence of, 247 
representation theorem for 
countable case, 109 
finite case, 101 
general case, 112 
uniqueness theorem for, 108 
Ordinal scale, 64, 65 
possible comparisons with, 74 
Ordinal utility measurement (see Ordinal 
measurement) 
Orientation of a relation, 270 
Outcome, 370 


Paasche price index, 91, 92, 93, 177 
Pair comparison experiment, 102, 107, 317 
pair comparison system, 273 
forced choice, 273 
discriminated, 289 
trace of, 295 
imperfect, 280 
Pair consistency condition, 187 (see also 
Magnitude-pair consistency condition) 
Pair estimation, 187 
Partial order, 15, 36-41 (see also Dimension) 
defined, 36 
strict, 15, 38 
Partial substitutability, 296 
Partition 
n-fold almost uniform, 390 
n-fold uniform, 390 


Pascal, B., 370 
Path, 28 
length of, 28 


Perfect substitutes relation, 44, 60, 61, 63 

Physical fitness, ratings of, 97 

Physical scale, 153 

Pindex, 90 

Poincare, H., 248 

Pollution index, 87-91 

Polynomial conjoint measurement, 232-235, 
383 

Pony-bicycle example, 248-249, 276, 285 

Positive element, 133 
Positive linear transformation, 64, 65 
Positivity, 133 
Power law, 155-157, 179-186 
some exponents, 156 
Preference 
measurement of, 4, 50, 51, 64, 65, 102, 
125-126, 128, 135, 141, 202, 204, 208, 
219, 220, 240, 247, 249, 257, 273, 274, 
275, 277, 288, 309, 317 ff., 342 
among multidimensional alternatives, 37 
(see also multidimensional alternatives) 
negative transitivity of, 103 
strict, 16, 20, 28, 29, 34, 103, 238, 313, 316 
transitivity of, 317 
weak, 16, 28, 313, 318 
Pre-order, 29 
Prescriptive axiom (rule), 3, 4, 102, 309, 325 
Primitive scale, 77 
Probabilistic consistency, 272-299; in partic- 
ular 273, 289, 290 
multidimensional theory of, 286 
Probabilistic theories, 7, 8 
Probabilistic versions of measurement repre- 
sentations, 129 (see also Statistical ana- 
logues of measurement representations) 
Probability 
group assessment of, 377, 379 
objective, 369-371 
universal and replicable nature of, 370, 
372 
relation between objective and subjective, 
374-376 
subjective (see Subjective probability) 
Probability measure, 371 
finitely additive, 371 
simple, 360 
Probability space, finitely additive, 371 
Product rule, 281 
Product structure, 197-245, 336 
defined, 197-198 
numerical, 198, 206 
obtaining, 197-203 
Productivity index, 84 
Prothetic continua, 155, 156 
Proximity, judged, 242 
Psychological scale, 153 
Psychophysical function, 66, 150-157, 277 
defined, 153 
functional equations for, 165 ff. 
in log-log coordinates, 180 
Psychophysical law, 153 
possible 158-178 (tabulated, 164) 
Psychophysical problem, the, 149-158 
Psychophysical scaling, 149-196 

Psychophysics, 16, 119, 149-196, 249, 275, 277 

Public Health Problem, the, 329-334, 339-340, 345, 346, 347 

Public health question, the, 329 

Public policy, xx 

Pure tone, 150 


Quadruple condition, 293 

Qualitative probability, 369 (see Subjective 
probability) 

measure of, 374 

Qualitative probability relation, 373 

Quantum mechanics, 353 

Quasi-additive function, 173 (see also Utility 
function, Quasi-additive) 

Quasi-additive representation, 235 

Quasi order, 15, 29 

Quotient of a relation, 31 

Quotient of a relational system, 45 


Radioactive decay (law of), 79, 174, 370 
Random utility model, 278 
Ranking, 8, 116-117 
Ratio estimation, 180, 187 
Ratio production, 180 
Ratio scale, 64, 65 
generalized, 282 
narrow sense, 79 
possible comparisons with, 74 
wide sense, 79 
Rationality, definition of, 4, 102, 272 
Reducing number of dimensions, 203-206, 
336-337 
Reduction 
and regularity, 60 ff. 
of relational system, 44-46, 61 
of strict weak order, 34 
of weak order, 31 
Reflexivity of an operation, 134 
Regular representation, 59, 63, 67, 252 
Regular scale, 57 ff. 
characterized, 60 
defined, 59 
narrow sense, 78 
reduction and, 60 ff. 
wide sense, 78 
Relation, 13-47 
antisymmetric, 15, 20 
asymmetric, 15, 19 
binary, 13 
complete, 15, 32 
defined, 13 
equivalence, 15, 25-28 
irreflexive, 15, 19 
n-ary, 17 
negatively transitive, 15, 20 
nonreflexive, 15 
nonsymmetric, 15, 19 
nontransitive, 15 
properties of, 15, 19-25 
quaternary, 17 
reflexive, 15, 19 
restriction of, 17 
strongly complete, 15, 29 
strongly connected, 28, 29 
symmetric, 15, 19 
ternary, 17 
transitive, 15, 20 
Relational system, 16, 44-46, 51 
irreducible, 45 
numerical, 51 
reduction of, 44-46 
shrinkable, 45, 61 
type of, 51 
Relative product, 18 
Representation, 54 
axioms for, 54 
regular, 59, 63, 252 
and scale type, 67 
Representation problem, 4, 54 ff. 
Representation theorem, 54 
for derived measurement, 77 
Response strength, measurement of, 66, 198, 
211-212, 220, 232-233, 234, 280, 282 
Restricted solvability, 218, 221 
Reward, 307 
Rise time, 150 
Risk, 6, 129, 233, 305-368, 320 
Risk-averseness, 326 
R-order topology, 121 


St. Petersburg Paradox, 334 
Savage axioms, 382-383, 385 
Scale, 54 
numerical, 58 
primitive, 77 
(see also Absolute scale; difference scale; 
interval scale; log-interval scale; nomi- 
nal scale; ordinal scale; ratio scale) 
Scale type, 64 
if admissible transformations are not de- 
fined precisely, 69 
alternative classifications, 69 
narrow sense, 78-79 
if there is no formal representation, 69, 376-377 
problems with if scale is irregular, 68 
wide sense, 78-79 
Scoring rule, 378, 379 
Scott-Suppes Theorem, 250-252 
generalization, 255-258 
need for finiteness, 255-256 
uniqueness in, 252-253 
Scott’s Theorem, 
for conjoint measurement, 224-226 
for difference measurement, 139-140 
for subjective probability measurement, 
392-394 
Semantic differential, 93 
Semiorder, 247-272, 288, 358 
defined, 250 
dimension of, 268 
homogeneous family of, 287-290 
representation theorem, 251 
uniqueness theorem, 252-253, 258-260 
weak order associated with, 257 
Semiorder Condition 
First, 296 
Second, 296 
Weak Second, 296 
Semiordered versions of measurement repre- 
sentations, 264-265 
of conjoint measurement, 265, 268 
of the expected utility representation, 265 
of extensive measurement, 264 
Sequence dating, 255 
Serial order, 255 
Seriation, 255, 258 
SEU Hypothesis (see Subjective Expected 
Utility Hypothesis) 
Severity factor, 89 
Severity tonnage, 94-95 
Shrinkable relational system, 45, 61 
Similar systems, classes of, 77 
Similarity transformation, 64, 65 
Simple majority rule, 119 
Simple order, 15, 28-36 
defined, 29 
strict, 15, 32 
Simple polynomial, 237 
Simple probability measure, 360 
Simple scalability, 297 
Social sciences, xix, 1, 4, 5, 50 
Society, problems of, xx 
Solvability, 
for conjoint measurement, 218, 221 
for extensive measurement, 137 
restricted, 218, 221 
weak, 133 
Sone, 64, 153 
SST (see Strong Stochastic Transitivity) 
Standard sequence 
in conjoint measurement, 217, 268 
strictly bounded, 217, 268 
in difference measurement, 137-138 
strictly bounded, 137-138 
in extensive measurement, 138 
strictly bounded, 138 
in axiomatization of magnitude estima- 
tion, 193 
strictly bounded, 193 
in subjective probability measurement, 399 
Standard weight, 138 
Statistical analogues of measurement representations, 103, 104, 221-222 (see also 
Probabilistic versions of measurement representations) 
Statistical summaries, meaningfulness of, 83-84 
Statistical tests, meaningfulness of, 83-84 
Statistical tests of measurement representations, 104 
Stereo speakers, rating of, 95-97, 232, 297 
Strictly follows, 253 
Strict partial order, 15, 38, 251 
dimension of, 39-40, 202-203, 271 
strict simple extension of, 39 
Strict simple extension, 39 
Strict simple order, 15, 32 
Strict utility model, 280-283, 284, 285, 294 
defined, 280 
statistical test of, 291-293 
Strict weak order, 15, 32, 33 
Strong component, 28 
Strong Stochastic Transitivity, 284, 285, 286, 287, 288 
strong version, 296, 297 
Strong utility model, 275-277, 278, 279, 280, 281, 283, 284, 285 
defined, 275 
Subjective Expected Utility Hypothesis, 309, 373, 380, 381 
Subjective probability, 129, 364-365, 369-403 
additivity of, 386 
applications of, 373 
defined, 372 
direct estimation of, 91, 374 
scale type of, 376-378 
from preferences among lotteries, 379-386 
measure of, 374 
existence of, 386-400 
insufficient conditions for (see de Finetti axioms) 
necessary and sufficient conditions for, 392-397 
sufficient conditions for, 390-392, 399-400 
uniqueness of, 391, 392, 394 
measurement of, 
approaches to, 373 
semiordered version of, 265 
relation to objective probability, 374-376 
Subjective probability structure in the sense of Scott, 393 
Subrelation, 17 
Substitutability, 233, 296 
Symmetric complement, 22, 33 
Symptoms of diseases, 241 
Synergistic effects, 87 
Szpilrajn’s Extension Theorem, 40 


Taste preference, 118 

Technological breakthroughs in energy- 
environment systems, 374 

Temperature, measurement of, 50, 51, 52, 54, 
55, 58, 64, 65, 66, 102, 108-109, 134-135 

Temperature difference, judgment of, 135 

Temperature-humidity measurement, 199, 
212, 227 

Temperature-humidity index, 212, 227 

Testing, 239 (see also Mental testing) 

Thomsen condition, 216, 220 

Threshold, 249, 258, 288 

of audibility, 158 

Thurstone Case V scale, 66 

Time, measurement of, 64, 65 

Tolerance factor, 89 

Tolerance geometry, 265 

Total order, 29 (see also Simple order) 

Trace, 295 

Traffic light phasing, 270 

Transitive closure, 28 

Transitivity conditions, 283-287 (see also 
Moderate, Strong, Weak Stochastic 
Transitivity) 

Transportation systems, alternative, 17, 28, 
37, 199, 200-201, 207 


Uncertainty, 6, 305-368 
Unfolding, 239-241 
multidimensional, 240 
Union of relations, 18 
Uniqueness problem, 4, 54 ff. 
derived measurement, 78 
Utility, 6-8, 199, 202 
additive, 383 (see also Utility function, 
additive) 
over consequences, 379-386 
differences in, 141 
expected, 306, 307, 383, (see also Expected 
utility) 
of a life, 308, 329-334 
marginal, 325 
decreasing, 185 
of money, 185, 214, 311, 320, 325-327, 334 
Utility function, 50, 129, 135, 141, 274 
additive, 7, 204, 210-211, 214, 312, 
337-340, 348 (see also Utility, additive) 
and order-preserving, 338, 339 
bilinear, 341 
cardinal, 7, 51, 141 
continuous, 8, 121, 210 
full, 317, 321, 322 
multidimensional (see Utility theory, mul- 
tidimensional) 
multilinear, 341 
non-additive, 342 
order-preserving, 50, 317, 383 
and additive, 338, 339 
calculation of, 317 ff., 325-326, 332-334 
scale type of, 321-322 
ordinal, 7, 50, 102, 105, 107, 206-210, 247, 
317(see also Ordinal measurement; Util- 
ity function, order-preserving) 
calculating, 203-206 
continuous, 121 
practical techniques for computing, 107 
quasi-additive, 235, 340-348 
definition, 341 
representation for, 342 
satisfying the EU Rule, 314 
society’s, 332-334 
Utility theory, 6-8 
multidimensional, 199, 206 (see also Multi- 
dimensional alternatives) 
examples, 336 
Utility models, probabilistic, 273-283 


Value function, 314 
additive, 338, 339, 340 
multiplicative, 344 
satisfying the EV Hypothesis, 314 
Value independence, 338 
Value of a life, 329-334 (see also Public 
Health Problem; medical decisionmaking) 
Vibration, measurement of, 157, 186 
Visual perception, 265 
Voter’s paradox, 119, 297 


Walras, 7 
Weak associativity, 127 
Weak mapping rule, 257 
Weak order, 15, 28-36 
defined, 29 
strict, 15, 32, 33 
Weak order associated with a semiorder, 257 
Weak solvability, 133 
Weak Stochastic Transitivity, 283, 284, 285, 286, 287, 288 
Weak utility model, 273-274, 275, 276, 283, 285, 287 
defined, 274 
Weakly to the right of, 29, 30, 256 
Weather, (see also Temperature, measure- 
ment of; temperature-humidity mea- 
surement) 
discomfort due to, 198, 212, 227 
forecasting of, 373, 374, 378, 379 
White noise, 150 
WST (see Weak Stochastic Transitivity) 